ABSTRACT


The  current  state  of  the  art  techniques  of  risk  assessment  rely  on  checklists  and 
human  expertise.  This  constitutes  a  weak  approach  because  different  people  could  arrive 
at different conclusions from the same scenario. The difficulty of estimating the duration
of projects that apply evolutionary software processes adds further intricacy to the
risk assessment problem. This thesis introduces a formal method to assess the risk and the
duration of software projects automatically. The method has been designed according to the
characteristics  of  evolutionary  software  processes  such  as  productivity,  requirement 
volatility  and  complexity.  The  formal  model  based  on  these  three  indicators  estimates  the 
duration  and  risk  of  evolutionary  software  processes.  The  approach  introduces  benefits  in 
two  fields:  a)  automation  of  risk  assessment  and,  b)  early  estimation  method  for 
evolutionary  software  processes. 
I. INTRODUCTION


A. THE IMMATURITY OF SOFTWARE ENGINEERING

"Despite  50  years  of  progress,  the  software  industry  remains  years— perhaps 
decades— short  of  the  mature  engineering  discipline  needed  to  meet  the  demands  of  an 
information-age society." (Gibbs, 1994). Many researchers have addressed the problem using
different  approaches:  formal  methods,  prototyping,  software  processes,  etc.  However,  in 
the  author's  opinion,  this  assertion  remains  true  today. 

Experience  suggests  that  building  and  integrating  software  by  mechanically 
processable  formal  models  leads  to  cheaper,  faster  and  more  reliable  products  (Luqi, 
1997). Software development processes such as the Hypergraph model for software
evolution  (Luqi,  1997),  or  the  Spiral  model  (Boehm,  1988),  have  improved  the  state  of 
the  art.  However,  in  the  author's  opinion  they  share  a  common  weakness:  risk  assessment. 

In  the  software  evolution  domain,  risk  assessment  has  not  been  addressed  as  part 
of  the  graph  model.  In  the  various  enhancements  and  extensions,  the  graph  model  does 
not include risk assessment steps; hence risk management remains a human-dependent
activity  that  requires  expertise. 

In his evaluation of the spiral model, Boehm mentioned the following difficulty:
"Relying on risk-assessment expertise. The spiral model places a great deal of
reliance on the ability of software developers to identify and manage sources of project
risk." "...Another concern is that a risk-driven specification will also be
people-dependent." (Boehm, 1988).
What is the reason that software engineering has not reached the maturity level of
other forms of engineering? It may be easier to find the answer by looking at the differences
between  software  engineering  and  other  disciplines.  In  the  author's  opinion,  one 
difference  is  that  software  engineering  is  highly  dependent  on  people.  A  second 
difference  is  that  software  engineering  is  younger  (forty  years  versus  centuries  for  civil 
engineering).  The  third  difference  is  that  the  product,  software,  is  intangible.  It  is  difficult 
to estimate its real value until late in the development process. Together, these differences
give software development projects a great deal of uncertainty.

Many  researchers  (Boehm,  1898),  (Charette,  1997),  (Gilb,  1988),  (Hall,  1997), 
(Jones,  1994),  (Karolak,  1996),  (SEI,  1996)  have  addressed  the  problem  of  risk 
assessment  following  the  perspective  of  the  traditional  disciplines.  The  tools  for  risk 
assessment are practice guides, checklists, taxonomies of risk factors and a few metrics.
All these methods work fine IF there is a person trained in risk assessment AND with
enough experience. Such resources are very scarce. Maybe that is the reason why
software  engineering  is  still  immature. 


B.  THE  ESTIMATION  PROBLEM 

Since  the  creation  of  the  first  computers,  tremendous  progress  has  been  made  in 
terms  of  hardware.  The  general-purpose  computer  has  been  especially  important  because 
of its versatility. The stored program allows specialized applications to be created in software.
These  applications  have  grown  in  size  and  complexity  covering  all  kinds  of  human 
activities.  Unfortunately,  the  ability  to  build  software  has  not  followed  the  same  rate  of 
progress (Hall, 1997, p. xv). Gerald Weinberg said "to call software development an
infant discipline is not a moral judgement, but merely a colorful way to summarize its
short history and present existence." (Gilb, 1977, Foreword). Software engineering is the
discipline that focuses on planning, developing and maintaining software products. It
seems that the creation of software poses different challenges than the creation of
hardware. The previous section discussed the author's hypotheses about this issue.
As the range and complexity of computer applications have grown, the
cost of software development has become the main cost component of a system.
The literature shows that in industry as well as in government environments, schedule
and cost overruns are tragically common (Luqi, 1989). Developing software is still a
high-risk activity. Despite the advances in technology and computer-aided software
engineering (CASE) tools, little progress has been made in improving the management of
software development projects (Hall, 1997). The acquisition and development
communities,  both  governmental  and  industrial,  lack  a  systematic  way  of  identifying, 
communicating  and  resolving  technical  uncertainty  (SEI,  1996).  Research  shows  that  45 
percent  of  all  the  causes  for  delayed  software  deliveries  are  related  to  organizational 
issues (vanGenuchten, 1991). Software is also the main contributor to cost in
computer systems (Boehm, 1981), (Karolak, 1996).


This  research  is  focused  on  software  project  risk  assessment,  in  other  words,  the 
prediction  of  success  of  the  project.  The  only  way  to  evaluate  the  degree  of  success  of  a 
project  is:  a)  to  compare  the  planned  and  actual  schedules;  b)  to  compare  the  planned  and 
actual  costs;  and  c)  to  compare  the  planned  and  actual  product  characteristics.  An 
emergent  branch  of  software  engineering  has  focused  on  this  last  part:  software 
reliability. However, the author thinks that more emphasis must be placed on the first two.

It is known that in software development, manpower and time are not
interchangeable (Brooks, 1974). It is also known that productivity rates are highly variable,
and that function and size are highly correlated with errors and duration of the project
(Putnam, 1980). It has also been learned that the majority of errors, and the most costly
ones, are introduced during the requirements phase (Boehm, 1981). It is also known that life
cycle manpower patterns follow heavy-tailed curves (Putnam, 1980, 1992, 1996, 1997),
(Boehm, 1981). Some practices and heuristics are known to improve the development
process (Humphrey, 1989). There exist CASE tools that improve productivity. There
exist macro models able to estimate, with different degrees of success, the effort and
duration of software projects (Albrecht, 1979), (Boehm, 1981, 2000), (Putnam, 1997).
What is not available is a model of the internal phenomenology of the software life
cycle. Without such knowledge, risk assessment is almost impossible.


C.  RESEARCH  QUESTIONS 

The  software  process  is  a  set  of  activities  with  dependency  relationships  that  occur 
over  a  certain  period  of  time.  From  this  point  of  view  software  projects  do  not  differ  from 
any other type of project. At the beginning there exists a great deal of uncertainty that can be
reduced  to  two  types:  time  and  money.  Other  intermediate  metrics  such  as  performance, 
rate  of  errors  and  effort  can  be  converted  to  time  and  cost.  As  time  goes  by  the  level  of 
uncertainty usually decreases as a consequence of the availability of information.
Unfortunately, the main resources (time and budget) also decrease over the same period.
So  managers,  as  decision-makers,  must  choose  between  making  early  decisions  with  lots 
of  uncertainty  or  postponing  decisions  trading  time  for  information.  This  leads  to  the 
basic  research  question  addressed  in  this  thesis: 

What are the early automatically collectable measures from the software
process that describe project risk?

The concept of early measure is emphasized because recognizing the risks in the
early phases increases the chance of effective contingency action, consequently improving
competitive advantage. The research focuses on automatically collectable measures
because  risk  identification  should  not  impose  significant  extra  workload  and  must  be  as 
objective  as  possible.  And  this  leads  to  the  second  question: 

How  can  these  measures  be  related  in  order  to  assess  project  risk? 




D.  GENERAL  APPROACH 


Despite  the  improvements  introduced  in  software  processes  and  automated  tools, 
risk assessment for software projects remains an unstructured problem dependent on
human expertise (Boehm, 1988), (Hall, 1997). It is the author's intention to find ways to
transform risk assessment into a structured problem. Solving the risk assessment problem
with indicators measured in the early phases would constitute a great benefit to software
engineering. It is at that moment that changes can be made with less impact on the budget
and schedule. The requirements phase is the crucial stage to assess risk because: a) it has
a huge amount of human interaction and communication that can be misunderstood and
be a source of errors; b) errors introduced at this phase are very expensive to fix; c) the
existence  of  generation  tools  diminishes  the  errors  in  the  development  process  if  the 
requirements  are  correct;  and  d)  requirements  evolve  introducing  changes  and 
maintenance  along  the  whole  life  cycle. 

It  is  necessary  to  construct  a  model  to  assess  risk  based  on  measurable  objective 
parameters that can be automatically collected and analyzed. One of the goals of this
research is to integrate a risk assessment model with the previous research on the
Computer-Aided Prototyping System (CAPS) at the Naval Postgraduate School. This integration is
required  in  order  to  capture  metrics  automatically  and  to  provide  project  managers  with  a 
more  complete  tool. 

Software  risk  management  includes  the  identification,  assessment  and  mitigation 
of risks. It requires dealing with complexity and assigning scarce resources in the most
efficient way. The scope of this thesis is limited to risk identification and risk assessment.
In the author's opinion, it is in these two phases that an automated method can have a
major impact.

This thesis studies project risk assessment by decomposing it into three classes:
resource  risk  assessment,  process  risk  assessment,  and  product  risk  assessment.  There 
exists  a  dependency  between  these  classes  of  risk.  The  successful  use  of  the  resources 
depends on their own characteristics and on the success of the product and the process.
The success of the process depends on itself as well as on the success of the resources and
the product. And the success of the product depends on itself and on the success of the
resources applied to the process. The three classes are strongly coupled and seem to be
different facets of the same entity: the project.

The  measure  of  project  risk  can  be  viewed  as  the  probability  of  developing  the 
required  product  on  the  planned  schedule  and  within  the  budget  with  the  available 
software  process  and  resources.  The  hypothesis  is  that  the  probability  distribution  is  a 
heavy-tailed  distribution  probably  from  the  Weibull  family. 
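
As an illustration of this view of risk, the following minimal sketch computes the probability of finishing within a planned schedule under a two-parameter Weibull distribution. The function names and the shape and scale values are assumptions made for the example only; they are not the calibrated model developed later in this thesis.

```python
import math


def weibull_cdf(t: float, shape: float, scale: float) -> float:
    """P(T <= t) for a two-parameter Weibull distribution."""
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be positive")
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def schedule_risk(deadline: float, shape: float, scale: float) -> float:
    """Probability that the project takes longer than the planned deadline."""
    return 1.0 - weibull_cdf(deadline, shape, scale)


# Hypothetical project: scale of 12 months, shape of 2 (illustrative values).
on_time = weibull_cdf(12.0, shape=2.0, scale=12.0)
late = schedule_risk(12.0, shape=2.0, scale=12.0)
```

When the deadline equals the scale parameter, the probability of finishing on time is 1 - 1/e (about 0.63) regardless of the shape, which gives a quick sanity check on any chosen parameters.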

It  is  necessary  to  create  a  set  of  metrics  customized  to  the  characteristics  of 
software  evolution  including  complexity,  requirements  stability,  personnel  stability  and 
productivity. The details of the model and the metrics are described in Chapter III. The
approach  has  a  fundamental  implication:  in  order  to  assess  risk  it  is  necessary  to  assess 
duration  and  effort. 


E.  SOFTWARE  EVOLUTION  FOCUS 

Studies  have  shown  that  early  parts  of  the  system  development  cycle  such  as 
requirements  and  design  specifications  are  especially  prone  to  errors  (Boehm,  1981). 
Problems  originating  in  the  early  stages  often  have  a  lasting  influence  on  the  reliability, 
safety  and  cost  of  the  system.  Evolutionary  prototyping  offers  an  iterative  approach  to 
requirements  engineering  to  alleviate  the  problems  of  uncertainty,  ambiguity  and 
inconsistency  inherent  in  the  process.  Moreover,  prototyping  can  improve  the  capture  of 
change  in  requirements  and  assumptions  during  the  development  process.  This  effect  is 
particularly  observed  in  projects  involving  multiple  stakeholders  with  different  points  of 
view (Ramesh, 1995), (Conklin, 1988).
Evolutionary driven CASE tools for computer-aided prototyping provide logical
assessment of the consistency and clarity of requirements and specifications. The use of
prototypes facilitates the requirements phase in any type of software project. In particular,
in real-time applications, where severe time constraints impose additional challenges, the use
of prototypes helps to describe the requirements in a clear, precise, consistent and
executable format. Prototypes can be applied to demonstrate system scenarios to the
affected parties as a way to: a) collect criticisms and feedback that are sources for new
requirements; b) detect deviations from users' expectations early; c) trace the
evolution of the requirements; and d) improve the communication and integration of the
users and the development personnel.

The benefits of prototyping are unquestionable. All modern life cycle models, such
as Boehm's Spiral, Luqi's Graph Model, Rapid Application Development (RAD), etc., are
based on prototyping. Experience suggests that building and integrating software by
mechanically processable formal models leads to cheaper, faster and more reliable
products (Luqi, 1997). However, all the software development processes mentioned above rely
on human expertise to identify, assess, and control risk.

A second concern in the use of prototypes is that they pose a problem for project
planning because of the uncertain number of cycles required to construct the product. For
the most part, project management and estimation techniques are based on linear
layouts  of  activities.  Critical  Path  Method  (CPM)  and  Program  Evaluation  Review 
Technique  (PERT)  are  not  well  suited  to  deal  with  cycles  because  they  are  based  on 
acyclic  digraphs. 
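
The limitation can be made concrete with a minimal sketch of a CPM-style forward pass. The activity names and durations are hypothetical, and the code is only an illustration of the general technique: the computation needs a topological order of the activities, which does not exist once a prototyping cycle is introduced.

```python
from graphlib import CycleError, TopologicalSorter


def project_duration(durations, predecessors):
    """Length of the critical path: the earliest finish time of the project.

    durations maps each activity to its duration; predecessors maps an
    activity to the set of activities that must finish before it starts.
    Raises graphlib.CycleError if the activity network contains a cycle.
    """
    finish = {}
    for activity in TopologicalSorter(predecessors).static_order():
        start = max((finish[p] for p in predecessors.get(activity, ())), default=0)
        finish[activity] = start + durations[activity]
    return max(finish.values(), default=0)


# Hypothetical linear layout of activities (durations in weeks).
durations = {"spec": 3, "design": 4, "code": 5, "docs": 2, "test": 2}
predecessors = {
    "design": {"spec"},
    "code": {"design"},
    "docs": {"spec"},
    "test": {"code"},
}
linear_duration = project_duration(durations, predecessors)

# Adding feedback from testing to specification creates a cycle,
# and the forward pass can no longer be computed.
cyclic = {**predecessors, "spec": {"test"}}
try:
    project_duration(durations, cyclic)
    cycle_detected = False
except CycleError:
    cycle_detected = True
```

In this example the linear layout yields a critical path of 14 weeks (spec, design, code, test), while the cyclic variant is rejected outright.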


F.  GENERAL  RESEARCH  DESIGN 

The research design of this thesis is based on two approaches. First, an extensive
literature review will provide the basis for background and theoretical foundations
covering  the  following  topics:  software  engineering,  software  reliability,  decision  theory, 
statistics,  probability,  project  management,  and  risk  management. 
Second, the problem of risk assessment will be analyzed using causal analysis to
identify  the  risk  factors.  A  model  will  be  constructed,  calibrated  and  validated  in  three 
ways:  a)  internal  consistency  proved  by  mathematics  and  statistics;  b)  black  box 
validation  by  comparing  its  outputs  in  duration  and  effort  with  other  available  models; 
and  c)  black  box  validation  against  simulations  conducted  with  ViteProject. 


G.  ORGANIZATION  OF  THESIS 

This  thesis  is  organized  in  six  chapters.  The  introduction  has  been  covered  in  the 
present  chapter.  Chapter  II  presents  the  theoretical  foundation  and  background  on 
software  engineering,  software  evolution  and  risk  management.  The  conceptual 
framework of the model is developed in Chapter III. Chapters IV and V present the
detailed research design and findings respectively. Finally, Chapter VI discusses the
conclusions and future research.
II. THEORETICAL FOUNDATION


A.  THEORETICAL  FOUNDATION  FOR  SOFTWARE  EVOLUTION 

1.  The  Graph  Model 

The  graph  model  is  a  data  graph  model  for  evolution  that  records  dependencies 
and  supports  automatic  project  planning,  scheduling,  and  configuration  management.  The 
evolution  process  is  represented  by  a  graph  that  at  any  given  moment  models  the  current 
and  the  past  state  of  the  software  system. 

Evolutionary prototyping offers an iterative approach to requirements engineering
to alleviate the problems of uncertainty, ambiguity and inconsistency inherent in the
process. Moreover, prototyping can improve the capture of change in requirements and
assumptions during the development process. This effect is particularly noticeable in
projects  involving  multiple  stakeholders  with  different  points  of  view. 

Computer  Aided  Prototyping  System  (CAPS)  is  a  CASE  tool  that  provides  a 
collection  of  techniques  and  languages  for  computer-aided  prototyping,  including  logical 
assessment  of  the  consistency  and  clarity  of  requirements  and  specifications.  CAPS 
methods  involve  the  use  of  real-time  constraints  and  abstract  modeling  to  describe  the 
requirements  in  a  clear,  precise,  consistent  and  executable  format.  Prototypes  can  be 
applied to demonstrate system scenarios to the affected parties as a way to: a) collect
criticisms and feedback that are sources for new requirements; b) detect deviations
from users' expectations early; c) trace the evolution of the requirements; and d)
improve  the  communication  and  integration  of  the  users  and  the  development  personnel. 
Real-time systems present special difficulties in terms of requirements engineering.
Some requirements are difficult for the user to provide and difficult for the analysts to
determine. The best way to discover these hidden requirements is via prototyping. CAPS
is a tool especially suited for this task. It has a graphical, easy-to-understand interface that
maps to a specification language, which in turn generates Ada code. The main
components  of  CAPS  are: 

(a)  The  prototype  system  description  language  (PSDL). 

(b)  User  interface  based  on  a  graphic  editor  with  a  palette  of  objects  that  include 
operators,  inputs,  outputs,  data  flows  and  operator  loops.  A  search  engine  helps  the 
designer  to  find  reusable  components. 

(c)  The  software  database  system  provides  a  repository  for  reusable  PSDL  components. 

(d)  The  execution  support  system  consists  of  a  translator,  scheduling  mechanisms, 
execution  monitors,  and  a  debugger. 

The  prototyping  process  consists  of  prototype  construction  and  modification 
(evolution)  based  on  evolving  requirements  and  code  generation.  Both  construction  and 
modification  are  exploratory  activities  with  a  common  target:  to  satisfy  multiple  users 
with  different  and  often  conflicting  points  of  view.  Requirement  engineering  is  a 
consensus  driven  activity  in  which  mechanisms  for  conflict  resolution  and  traceability  of 
requirement  evolution  represent  critical  success  factors. 

PSDL  is  based  on  data  flow  under  real-time  constraints  and  uses  an  enhanced  data 
flow  diagram  that  includes  non-procedural  control  and  timing  constraints.  PSDL  serves 
as  an  executable  prototyping  language  at  a  specification  or  design  level.  The  user 
interface  contains  a  graphic  editor,  a  browser  to  view  reusable  components,  and  an  expert 
system  that  provides  the  capability  to  generate  English  text  descriptions  of  PSDL 
specifications. 
The software database system provides the repository facilities for reusable
components,  as  well  as  control  of  versions.  The  execution  support  system  consists  of  a 
translator  that  generates  code  that  binds  the  reusable  components,  scheduling 
mechanisms,  and  a  debugger. 

The  model  views  a  software  evolution  process  as  a  partially  ordered  set  of  steps. 
Steps  represent  activities  required  to  produce  the  system.  A  step  has  states  that  reflect  the 
dynamic  progression  of  the  activity  from  the  moment  that  it  is  proposed  to  the  moment  it 
is  completed  or  abandoned. 
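
A minimal sketch of this view is given below. It assumes a simplified set of states (the intermediate "in progress" state is hypothetical) and represents the partial order by recording, for each step, the steps it depends on. It is meant only to illustrate the idea, not to reproduce the formal definition of the graph model.

```python
from dataclasses import dataclass, field
from enum import Enum


class StepState(Enum):
    PROPOSED = "proposed"
    IN_PROGRESS = "in progress"  # hypothetical intermediate state
    COMPLETED = "completed"
    ABANDONED = "abandoned"


@dataclass
class Step:
    name: str
    depends_on: set = field(default_factory=set)  # names of predecessor steps
    state: StepState = StepState.PROPOSED


def ready_steps(steps):
    """Names of proposed steps whose predecessors have all been completed."""
    done = {s.name for s in steps if s.state is StepState.COMPLETED}
    return sorted(
        s.name
        for s in steps
        if s.state is StepState.PROPOSED and s.depends_on <= done
    )


# Hypothetical evolution process with three steps.
steps = [
    Step("analyze criticism", state=StepState.COMPLETED),
    Step("modify specification", depends_on={"analyze criticism"}),
    Step("regenerate prototype", depends_on={"modify specification"}),
]
next_steps = ready_steps(steps)
```

Here only "modify specification" is ready, because "regenerate prototype" still waits on a step that has not been completed.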

The graph model has experienced its own evolution process. (Luqi, 1989)
introduced the primitive version of the model. (Mostov, 1989), (Mostov, 1990) and (Luqi,
1990) refined and elaborated the model. In (Luqi, 1990), the notion of hypergraph was
introduced to realize automated software evolution in multidimensional phases. Further
refinements, including scheduling and team coordination, were introduced by (Badr,
1993). Conflict resolution of requirements and criticisms was introduced by (Ramesh, 1992)
and (Ibrahim, 1996). (Luqi, 1997) extended the graph model to a hypergraph that
improved the traceability of dependencies and introduced the concept of
hyper-requirements. Finally, Harn extended the model to a relational hypergraph model (Harn,
1998a, Harn, 1998b, Harn, 1998c).


2.  Conflict  Resolution  Model 

Evolutionary  software  development  requires  a  way  to  solve  the  conflicts  that 
could  occur  between  various  users'  points  of  view.  System  design  must  follow  a 
deliberation  process  that  involves  the  resolution  of  issues  or  concerns  that  must  be 
addressed to satisfy user requirements. (Conklin, 1988) introduced the IBIS model and
(Ramesh, 1992) extended it, addressing the following concepts:
(1) Requirements represent the goals to be satisfied by the design process.

(2) Issues are questions or concerns that different stakeholders introduce.

(3) Positions are alternatives that address an issue.

(4) Arguments either support or object to a position.

(5) Decisions represent the resolution of issues and lead to constraints.


Figure  2.1:  Ramesh's  model 
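
The concepts listed above can be sketched as a small data structure. The class names follow the concepts in the list; the decision rule (choose the position with the highest net support) and the example content are assumptions introduced only for illustration, since the model itself does not prescribe how stakeholders reach a decision.

```python
from dataclasses import dataclass, field


@dataclass
class Argument:
    text: str
    supports: bool  # True if it supports the position, False if it objects


@dataclass
class Position:
    text: str
    arguments: list = field(default_factory=list)

    def net_support(self) -> int:
        """Supporting arguments minus objecting arguments."""
        return sum(1 if a.supports else -1 for a in self.arguments)


@dataclass
class Issue:
    question: str
    positions: list = field(default_factory=list)


def decide(issue: Issue) -> Position:
    """Hypothetical decision rule: pick the position with the highest net support."""
    return max(issue.positions, key=Position.net_support)


# Hypothetical issue raised by two stakeholders.
polling = Position("Poll the sensor every 100 ms", [
    Argument("Simple to schedule", supports=True),
    Argument("Wastes processor time", supports=False),
])
interrupts = Position("Use interrupts", [
    Argument("Meets the response-time requirement", supports=True),
    Argument("Lower processor load", supports=True),
])
issue = Issue("How should sensor data be acquired?", [polling, interrupts])
decision = decide(issue)
```

A real deliberation would also record which requirement each issue refers to and which constraints follow from the decision; those links are omitted to keep the sketch short.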
3. Relational Hypergraph Model

The relational hypergraph model, introduced in (Harn, 1999e), is a formal
model for software evolution that incorporates the features of the previous graph models.
The hypergraph model (Luqi, 1997) represents the evolution history, as well as the plan
for the future, in a hypergraph. A hypergraph is a directed graph with hyperedges, which
may have multiple input and output nodes. The formal definition of the relational
hypergraph model is presented in Appendix A.
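
To illustrate the notion of a hyperedge with multiple input and output nodes, the following minimal sketch stores each hyperedge with its input and output sets and traces the nodes on which a given node transitively depends. The node and edge names are hypothetical, and the code is an informal illustration rather than the formal definition given in Appendix A.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperEdge:
    name: str
    inputs: frozenset   # nodes consumed by the step
    outputs: frozenset  # nodes produced by the step


def upstream(edges, node):
    """All nodes that `node` transitively depends on through the hyperedges."""
    seen = set()
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for edge in edges:
            if current in edge.outputs:
                for source in edge.inputs:
                    if source not in seen:
                        seen.add(source)
                        frontier.append(source)
    return seen


# Hypothetical evolution history with two steps.
edges = [
    HyperEdge("step1", frozenset({"req1", "req2"}), frozenset({"spec_v1"})),
    HyperEdge("step2", frozenset({"spec_v1", "criticism1"}),
              frozenset({"spec_v2", "prototype_v2"})),
]
dependencies = upstream(edges, "prototype_v2")
```

In this example the second prototype version depends on the first specification, on the criticism that triggered the change, and, through the first step, on both original requirements.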


4.  Conclusions  about  the  Relational  Hypergraph  Model 

The preceding definitions constitute the formal specification of the relational
hypergraph model. It constitutes a framework to support software evolution processes.
However, risk assessment has been omitted from the specification. This omission creates
a  human  dependency  in  risk  assessment.  Despite  this  limitation,  the  model  can  be 
extended  to  support  automated  risk  assessment. 
B. THEORETICAL FOUNDATION FOR RISK MANAGEMENT


1.  Risk  and  Uncertainty 

Developing software is still a high-risk activity. Despite the advances in
technology and CASE tools, little progress has been made in improving the management
of software development projects. The acquisition and development communities, both
governmental and industrial, lack a systematic way of identifying, communicating and
resolving technical uncertainty (SEI, 1996). Research shows that 45 percent of all the
causes for delayed software deliveries are related to organizational issues (vanGenuchten,
1991). Software is also the main contributor to cost in computer systems (Boehm,
1981), (Karolak, 1996). Despite the improvements in tools and methodologies, there is
no evidence of success in moving from the idea to the product. A study published by the
Standish Group reveals that the percentage of software projects that fail has dropped from
40% in 1997 to 26%. However, the percentage of projects with cost and schedule
overruns grew from 33% in 1997 to 46% (Reel, 1999).

Part of the problem is the misinterpretation of the importance of risk management. It
is usually viewed as an extra activity layered on the assigned work, or worse, as an
outside  activity  that  is  not  part  of  the  software  process  (Hall,  1997),  (Karolak,  1996). 

A  second  source  of  the  problem  is  the  lack  of  tools  needed  to  perform  risk 
management  (Karolak,  1996).  The  main  reason  for  this  lack  of  tools  is  that  risk 
assessment is apparently an unstructured problem. To define unstructured problems it is
first necessary to explain structured processes. Structured processes involve routine
and repetitive problems for which at least one solution exists. Unstructured processes
require decision-making based on a three-phase method (intelligence, design, choice)
(Turban & Aronson, 1998). An unstructured problem is one in which none of the three
phases is structured. Risk management is highly biased by managers' perceptions and
characteristics that are difficult to represent in an algorithm. Depending on the
decision-maker's risk behavior, he/she can opt to choose early with a lack of information, or to
postpone the decision, gaining time to invest in obtaining information but losing
control of the opportunity.

The  third  source  of  the  problem  is  the  confusion  created  by  the  informal  use  of 
terms. Often, the software engineering community (and most of the project
management community (Wideman, 1992)) uses the term "risk" in a very lax sense.
Generally, software risk is viewed as a measure of the likelihood of a loss or an
unsatisfactory outcome affecting the software from different points of view: project,
process and product (Hall, 1997), (SEI, 1996). This definition of risk is misleading
because it confounds the concepts of risk and uncertainty. In general, most of the
decision making during the software process is under uncertainty rather than under risk.
Let us briefly discuss the decision-making environments in order to clarify these concepts.

There  are  three  possible  situations  in  any  decision  context:  certainty,  risk  and 
uncertainty.  Decisions  under  certainty  occur  when  the  decision-maker  knows  exactly  the 
consequence  of  each  alternative  or  decision  choice.  In  this  case  the  decision  process  is 
very  simple:  the  alternative  with  the  best  outcome  is  chosen  by  examination.  However, 
this  is  a  rare  situation. 

Usually  the  decision-maker  does  not  have  a  complete  picture  of  the  future,  but 
knows  the  probability  of  occurrence  of  the  various  states  of  nature.  In  this  case  the 
decision-making is under risk, and many techniques can be applied to support the
decision: expected monetary value, expected value of perfect information, opportunity
loss, and sensitivity analysis, among others (Render, 1997). All these methods rely on the
strong assumption that the exact probability of each scenario is known.
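
As a small illustration of decision-making under risk, the sketch below selects the alternative with the highest expected monetary value. The payoff table, the alternatives and the probabilities are hypothetical values chosen for the example; the function names are likewise assumptions.

```python
def expected_monetary_value(payoffs, probabilities):
    """Probability-weighted average payoff of one alternative."""
    return sum(p * x for p, x in zip(probabilities, payoffs))


def best_by_emv(payoff_table, probabilities):
    """Alternative with the highest expected monetary value."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return max(
        payoff_table,
        key=lambda alt: expected_monetary_value(payoff_table[alt], probabilities),
    )


# Hypothetical payoffs (in thousands) for two states of nature:
# favorable market, unfavorable market.
payoff_table = {
    "build in-house": [200, -180],
    "buy package": [100, -20],
    "do nothing": [0, 0],
}
probabilities = [0.5, 0.5]
choice_under_risk = best_by_emv(payoff_table, probabilities)
```

With equal probabilities the expected values are 10, 40 and 0 respectively, so "buy package" is selected. The result is only as good as the probabilities supplied, which is exactly the assumption questioned in the text.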

A completely different situation arises when the decision-maker does not have
precise information about the probabilities of occurrence of the different states of nature,
or even the list of the states. In this case it is impossible to assess the outcome; hence a
completely different set of techniques must be applied to support the decision-making
process: maximin, minimax, Laplace, Hurwicz, or minimax regret (Render, 1997).
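
Four of these criteria can be sketched for a payoff table in which larger values are better. The table and the coefficient of optimism are hypothetical; the sketch uses the textbook forms of the criteria and needs no probabilities at all.

```python
def maximin(table):
    """Pessimistic criterion: best of the worst payoffs."""
    return max(table, key=lambda alt: min(table[alt]))


def laplace(table):
    """Equally likely criterion: best average payoff."""
    return max(table, key=lambda alt: sum(table[alt]) / len(table[alt]))


def hurwicz(table, alpha):
    """Weighted mix of best and worst payoff; alpha is the coefficient of optimism."""
    return max(
        table,
        key=lambda alt: alpha * max(table[alt]) + (1 - alpha) * min(table[alt]),
    )


def minimax_regret(table):
    """Alternative whose largest regret (opportunity loss) is smallest."""
    n_states = len(next(iter(table.values())))
    best = [max(row[j] for row in table.values()) for j in range(n_states)]
    return min(
        table,
        key=lambda alt: max(best[j] - table[alt][j] for j in range(n_states)),
    )


# Hypothetical payoffs (in thousands) for a favorable and an unfavorable state.
table = {
    "build in-house": [200, -180],
    "buy package": [100, -20],
    "do nothing": [0, 0],
}
choices = {
    "maximin": maximin(table),
    "laplace": laplace(table),
    "hurwicz": hurwicz(table, alpha=0.8),
    "minimax regret": minimax_regret(table),
}
```

On this table the criteria disagree: maximin selects "do nothing", Laplace and minimax regret select "buy package", and Hurwicz with an optimism of 0.8 selects "build in-house". The disagreement shows how strongly a decision under uncertainty depends on the attitude of the decision-maker.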

The  distinction  between  these  two  concepts  is  important  for  decision  making 
because  it  leads  to  drastically  different  approaches  to  risk  assessment: 

(a) Assessing software risk using a probabilistic approach, usually by measuring reliability.
In this case the decision-making is under risk. However, even using probabilistic
models there exists a component of uncertainty created by uncertainties in parameter
values, uncertainties in modeling, and ambiguities in the degree of completeness
(Baybutt, 1989).

•  Ambiguities  in  parameter  values  are  consequence  of  the  need  to  estimate 
parameter  values  from  data.  The  ambiguities  arise  because  the  available  data  is 
usually  incomplete  and  because  the  analyst  makes  inferences  from  a  state  of 
incomplete  knowledge. 

• Modeling uncertainties arise from deficiencies of the model in representing reality.

•  Completeness  ambiguities  are  introduced  by  the  inability  of  the  analyst  in 
evaluating  exhaustively  all  contributions  to  risk. 

The  treatment  of  uncertainties  in  risk  analysis  involves  the  evaluation  of  uncertainties 
in  the  input,  the  propagation  through  each  part  of  the  risk  analysis,  the  combination  of 
the  uncertainties  in  the  output,  the  display  and  interpretation  of  risk  estimates,  and  the 
treatment  of  uncertainties  in  decision-making. 

(b)  Assessing  software  risk  using  a  framework  of  practices  and  guidelines  (SEI,  1996).  In 
this  case  there  is  not  a  probabilistic  model  to  rely  on,  hence  the  decision-making  is 
under  uncertainty. 

It follows, as previously stated, that most of the decisions made by software managers
are made under uncertainty. Two categories of research have attacked the
issue from different angles. First, probabilistic approaches have been applied successfully
to assess the reliability of the product (Lyu, 1995), (Schneidewind, 1975), (Musa, 1998).
However, these approaches assess software reliability when it is too late for software
engineering purposes, because the product is complete or almost complete.

A second category of research has addressed the problem from a different
perspective, trying to assess the risk in parallel with the development process. However,
in this case the approach is less rigorous and unstructured; the proposals are basically lists
of practices and checklists (SEI, 1996), (Hall, 1997) or scoring techniques (Karolak,
1996). Paradoxically, SEI defines software technical risk as a measure of the probability
and severity of adverse effects in the development of software that does not meet its
intended functions and performance requirements (SEI, 1996). However, the term
"probability" in this case is misleading, because the probability is unknown. There is a
third category of research, focused mainly on the estimation of effort and time, that has
characteristics of both previous groups. This approach is tangentially related to risk and will
be discussed in Section E.

The  fourth  source  of  confusion  is  introduced  when  the  term  "risk"  is  used  to 
describe  different  things.  It  is  not  only  erroneously  used  as  a  synonym  of  uncertainty  as 
stated  before,  but  it  is  also  used  as  a  synonym  of  "threat"  (SEI,  1996),  (Hall,  1997), 
(Karolak,  1996).  In  this  research  the  term  risk  is  reserved  to  indicate  the  probabilistic 
outcome  of  a  succession  of  states  of  nature,  and  the  term  "threat"  is  used  to  identify  the 
dangers  that  can  occur. 


2.  Decision  under  Uncertainty 

Very  frequently  decision-makers  make  decisions  using  incomplete  information. 
Particularly,  the  problem  of  decision-making  under  uncertainty  involves  choosing  among 
a  set  of  alternatives  under  the  following  conditions: 

•  The  outcome  of  each  course  of  action  depends  on  several  possible  states  of 
nature. 

•  The outcome for each alternative under each state of nature is known.

•  The probability of occurrence of each state of nature is unknown.


When the probability of occurrence of each state of nature is unknown or cannot
be assessed, the following five techniques can be applied:

•  Maximax criterion. This criterion implies an optimistic vision of the future.
The method consists of finding the maximum outcome of each alternative and
choosing the alternative for which this maximum is largest.

•  Maximin  criterion.  This  method  finds  the  alternative  that  maximizes  the 
minimum  outcome.  It  is  a  pessimistic  approach. 

•  Laplace criterion. This method assigns equal probabilities to each state of nature,
computes the average outcome of each alternative, and chooses the alternative with
the highest average.

•  Criterion of realism. This method is also known as the Hurwicz criterion. It is a
compromise between an optimistic and a pessimistic decision. The decision-maker
must choose a coefficient of realism α between 0 and 1. This
coefficient is applied to the outcome of the favorable state of nature, and (1 - α) is
applied to the outcome of the unfavorable state of nature. The alternative with
the highest weighted sum is chosen.

•  Minimax criterion. This method, also called minimax regret, is based on opportunity
loss. It chooses the alternative whose maximum opportunity loss is the
smallest.
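The five criteria can be sketched in a few lines of Python. The payoff table and the value
of the coefficient of realism below are hypothetical and serve only as an example. For
tables with more than two states of nature, the sketch applies the coefficient to the best
and worst outcome of each alternative, which is an assumption that generalizes the
two-state description given above.

```python
from typing import Dict, List

# Hypothetical payoff table: alternative -> outcome under each state of nature
# (favorable, unfavorable). All values are illustrative assumptions.
TABLE: Dict[str, List[float]] = {
    "large_project": [200.0, -180.0],
    "small_project": [100.0, -20.0],
    "do_nothing": [0.0, 0.0],
}


def maximax(table: Dict[str, List[float]]) -> str:
    """Optimistic: pick the alternative with the largest best-case outcome."""
    return max(table, key=lambda a: max(table[a]))


def maximin(table: Dict[str, List[float]]) -> str:
    """Pessimistic: pick the alternative with the largest worst-case outcome."""
    return max(table, key=lambda a: min(table[a]))


def laplace(table: Dict[str, List[float]]) -> str:
    """Equal probabilities: pick the alternative with the highest average."""
    return max(table, key=lambda a: sum(table[a]) / len(table[a]))


def hurwicz(table: Dict[str, List[float]], alpha: float) -> str:
    """Criterion of realism: weight best case by alpha, worst by (1 - alpha)."""
    return max(
        table,
        key=lambda a: alpha * max(table[a]) + (1.0 - alpha) * min(table[a]),
    )


def minimax_regret(table: Dict[str, List[float]]) -> str:
    """Pick the alternative whose maximum opportunity loss is smallest."""
    n_states = len(next(iter(table.values())))
    # Best achievable outcome under each state of nature.
    best = [max(row[j] for row in table.values()) for j in range(n_states)]

    def max_regret(a: str) -> float:
        return max(best[j] - table[a][j] for j in range(n_states))

    return min(table, key=max_regret)
```

With this table and α = 0.8, maximax and the criterion of realism select the large
project, maximin selects doing nothing, and the Laplace and minimax criteria select the
small project, which shows how strongly the choice depends on the attitude of the
decision-maker.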

3.  Subjective  Probabilities  and  Utility  Theory 

Another way to deal with uncertainty is to use a subjective estimation of
the probabilities of occurrence of the different states of nature. This approach is easy to
implement but requires a great deal of experience to judge the success probability of each
alternative. Group consensus techniques, like the Delphi method (Dalkey & Helmer, 1963),
are usually very helpful in such situations (Marshall, 1995).

Decision trees based on the expected monetary value (EMV) could lead to
bad decisions in many cases. There are many situations in which a linear payoff function
is unable to represent the behavior of people (Marshall, 1995). These are the two reasons
to study utility theory. In practice, historical data can be analyzed to obtain an objective
estimate of the outcomes. But in some situations, especially those that incorporate management
decisions, historical data may not be relevant. The judgments and beliefs of the
decision-makers may be more important than estimating relevant probabilities (Marshall,
1995). Before describing utility theory in detail, two definitions are required:

"The  indifference  probability  for  a  decision  problem  between  a  risky 
venture  and  a  riskless  alternative  with  given  known  results  is  that 
probability  of  success  in  the  risky  venture  for  which  the  decision-maker  is 
indifferent  to  the  two  alternatives."  (Marshall,  1995). 

"The  certainty  equivalent  to  a  risky  venture  is  the  least  amount  the 
decision-maker  would  have  to  obtain  for  certain  by  choosing  the  riskless 
alternative."  (Marshall,  1995). 

In many situations the indifference probability and the certainty equivalent would
have different values for different people. The differences reflect various behaviors
toward risk. Utility assessment assigns the worst outcome a utility of 0 and the best
outcome a utility of 1. All other outcomes have a utility value between 0 and 1. When
two or more alternatives are equally attractive (or unattractive), that is, when the
decision-maker is indifferent, their utility values should be the same. The problem is to
find the probability that makes the decision-maker indifferent.
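The following Python sketch illustrates why EMV and utility can lead to different
decisions. It assumes that the utility of an intermediate outcome is set equal to the
indifference probability stated by the decision-maker, with the worst outcome at 0 and the
best at 1. The monetary outcomes, the probabilities and the indifference probability of
0.7 are hypothetical.

```python
from typing import Dict, List, Tuple

# A lottery is a list of (probability, monetary outcome) pairs.
Lottery = List[Tuple[float, float]]

# Hypothetical utilities: worst outcome -> 0, best outcome -> 1. The utility
# of the intermediate outcome (30) is the assumed indifference probability:
# the decision-maker is indifferent between 30 for certain and a venture
# paying 200 with probability 0.7 and -100 otherwise.
UTILITY: Dict[float, float] = {-100.0: 0.0, 30.0: 0.7, 200.0: 1.0}

RISKY: Lottery = [(0.5, 200.0), (0.5, -100.0)]  # risky venture
SAFE: Lottery = [(1.0, 30.0)]                   # riskless alternative


def emv(lottery: Lottery) -> float:
    """Expected monetary value of a lottery."""
    return sum(p * x for p, x in lottery)


def expected_utility(lottery: Lottery, utility: Dict[float, float]) -> float:
    """Expected utility of a lottery under the given utility assignment."""
    return sum(p * utility[x] for p, x in lottery)
```

Under these assumptions the risky venture has the higher EMV (50 against 30), but the
riskless alternative has the higher expected utility (0.7 against 0.5), so a risk-averse
decision-maker would prefer it.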

Until now, only decision-making with a single attribute has been considered. A more
general scenario would have many attributes for measuring the decision. Often, these
attributes conflict with each other; hence optimizing one results in suboptimizing others.
Thus, it is necessary to use trade-offs to resolve such conflicts. A common approach to
solving multiattribute problems is to combine the different measures into a single
numeric measure. The problem can then be treated as a single-attribute problem (Marshall,
1995). In many decision problems it is very difficult to establish measurement criteria,
particularly when the decisions are not at the operational level. At the operational level,
decisions can be measured in terms of lines of code or function points. However, at the
project management level, the effectiveness of a decision could be measured in terms of
quality, stability, marketing impact, etc. In such cases, multiattribute utility theory should
be applied (Fig. 2.2).

Figure 2.2: Multiattribute Decision Tree (outcomes A1, A2, ..., An)


The decision-maker must provide his estimation of return for each attribute
related to the decision, as a vector R = (R1, R2, ..., Rn). The decision-maker must
also introduce his preferences as a weight vector W = (W1, W2, ..., Wn). The outcome
of each attribute is given by Ai, such that:

$$A_i = W_i \cdot R_i, \quad \text{where} \quad \sum_{i=1}^{n} W_i = 1$$
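The weighting scheme can be sketched in Python as follows. The sketch computes
Ai = Wi * Ri for each attribute, sums the weighted attributes of each alternative, and
rescales the sums so that the best alternative receives 1 and the worst 0. The attribute
names, returns and weights are hypothetical, and the min-max rescaling is only one
possible conversion function, assumed here because the text does not prescribe one.

```python
from typing import Dict, List

# Hypothetical weights for three attributes (quality, stability, marketing
# impact); by definition they must add up to 1.
WEIGHTS: List[float] = [0.5, 0.3, 0.2]

# Hypothetical estimated returns R = (R1, ..., Rn) for each alternative.
RETURNS: Dict[str, List[float]] = {
    "alternative_a": [8.0, 6.0, 4.0],
    "alternative_b": [5.0, 5.0, 7.0],
    "alternative_c": [2.0, 4.0, 3.0],
}


def weighted_attributes(returns: List[float], weights: List[float]) -> List[float]:
    """Return the vector (A1, ..., An) with Ai = Wi * Ri."""
    if len(returns) != len(weights):
        raise ValueError("returns and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must add up to 1")
    return [w * r for w, r in zip(weights, returns)]


def normalized_outcomes(
    alternatives: Dict[str, List[float]], weights: List[float]
) -> Dict[str, float]:
    """Sum the weighted attributes and rescale to [0, 1] (assumed min-max)."""
    totals = {
        name: sum(weighted_attributes(r, weights))
        for name, r in alternatives.items()
    }
    lo, hi = min(totals.values()), max(totals.values())
    if hi == lo:  # all alternatives are equally attractive
        return {name: 1.0 for name in totals}
    return {name: (t - lo) / (hi - lo) for name, t in totals.items()}
```

With these assumed values the weighted sums are 6.6, 5.4 and 2.8, so alternative A
receives 1, alternative C receives 0, and alternative B falls in between.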

The outcome for each alternative is then calculated as a function of the sum of the
attributes (A1, A2, ..., An) converted to a value between 0 and 1, where 1 is given to the
best outcome and 0 to the worst.


C.  SOFTWARE ENGINEERING FOUNDATIONS

The literature and research about risk and risk management are very extensive. This
research focuses on a partition that comprises operational research, project
management, software engineering and software reliability. Operational research provides
the  theoretical  foundation  to  describe  and  analyze  risk.  Project  management,  software 
engineering  and  software  reliability  apply  the  theory.  This  research  narrows  the  problem 
to  software,  specifically  to  the  software  engineering  domain. 

Taxonomies are very useful. They facilitate the understanding of complexity by
partitioning the problem into disjoint pieces that are simpler. The review of the literature
shows two different schools of thought:

(1)  The  group  that  studies  the  problem  of  software  risk  from  the  point  of  view 
of  the  development  process.  This  group  follows  a  forward  approach 
managing  the  risk  in  parallel  with  the  development  process.  The  caveat  of 
many  of  these  approaches  is  that  they  are  not  formal  and  their  success 
mainly  depends  on  human  expertise. 

(2)  The group that studies the problem of software risk from the point of view
of reliability. This group follows a post mortem approach, studying the
product created and inferring its future behavior. This category is strongly
supported by statistics. However, from the point of view of software
engineering, it has less impact because the findings arrive too late to make
changes in the product without incurring huge costs.

1.  Software  Engineering  Institute  (SEI) 

The Software Engineering Institute (SEI), at Carnegie Mellon, relies on
improving the process as a way to improve the products and diminish risks. This
philosophy is particularly clear in a guideline created at the request of the USAF by SEI
and Mitre Corporation (Humphrey, 1987). The document describes a method to assess the
software engineering capabilities of contractors. The guideline states that the quality of
the product depends on the quality of the process, which depends on the technology used
to support it, which depends on the maturity level of the organization. Hence, by
transitivity, the quality of the product depends on the maturity level of the organization.
Consequently, by assessing the maturity of the organization it is possible to estimate the
attributes of the product.

The SEI proposes a three-dimensional view of the risk management process
composed of (SEI, 1996):

(a)  Temporal  dimension  that  includes  the  micro  perspective,  that  is  from  the 
point  of  view  of  the  project,  and  the  macro  perspective  that  covers  the 
complete  life  cycle. 

(b)  Methodological  dimension  that  includes  practices  (software  risk 
evaluation  (SRE),  continuous  risk  management  (CRM)  and  team  risk 
management  (TRM)),  and  basic  constructs  including  the  SEI's  risk 
taxonomy. 

(c)  Human dimension that considers the perspectives of the individual, the
team, the management and the stakeholder.

The SEI approach to risk assessment uses a risk taxonomy questionnaire to ensure
that all risk areas are systematically addressed. The complete taxonomy can be found
in (SEI, 1996). Table 2.1 presents a brief summary to show the characteristics analyzed.

Table 2.1: SEI's taxonomy of risks (SEI, 1996)


1.  Product  engineering 

1.1.  Requirements (stability, completeness, clarity, validity, feasibility, precedent, and scale).

1.2.  Design  (functionality,  interfaces,  performance,  testability,  hardware  constraints,  and 
non-developmental  software). 

1.3.  Code  and  unit  test  (feasibility,  testing,  coding/implementation). 

1.4.  Integration  and  test  (environment,  product,  system). 

1.5.  Engineering  specialties  (maintainability,  reliability,  safety,  security,  human  factors,  and 
specifications). 

2.  Development  environment 

2.1.  Development process (formality, suitability, process control, familiarity, and product
control). 

2.2.  Development  system  (capacity,  suitability,  usability,  familiarity,  reliability,  system 
support,  and  deliverability). 

2.3.  Management  process  (planning,  project  organization,  management  experience,  program 
interfaces). 

2.4.  Management  methods  (monitoring,  personnel  management,  quality  assurance,  and 
configuration  management). 

2.5.  Work  environment  (quality  attitude,  cooperation,  communication,  and  morale). 

3.  Program  constraints 

3.1.  Resources (schedule, staff, budget, and facilities).

3.2.  Contract  (type  of  contract,  restrictions,  and  dependencies). 

3.3.  Program interfaces (customer, associate contractors, subcontractors, prime contractor,
corporate management, vendors, and politics).


The  SEI  approach  presents  some  problems: 

•  Many  of  the  items  covered  by  this  taxonomy  are  highly  subjective  and 
difficult  to  express  in  terms  of  equations.  How  to  measure  politics?  How  to 
measure  with  confidence  the  morale?  The  only  way  is  to  use  qualitative 
measures  that  have  inherent  subjectivity. 

•  Many of the items are covered more than once. For instance, human factors,
work environment and budget seem to be highly related.

•  The guidelines are sets of heuristics and good practices whose impact on the
success of the project depends on human experience.


Consequently, this approach relies on the ability of the person using the checklist.
An expert is required to assess the risk.


2.  Hall

Elaine Hall's method for managing risk (Hall, 1997) is derived from the SEI
model. In her view four major critical success factors are responsible for risk
management: People, Process, Infrastructure, and Implementation (P²I²).

•  People  participate  in  risk  management  by  implementing  the  processes 
according  to  the  plans,  by  detecting  problems,  communicating  issues  and 
introducing  uncertainties  in  their  work.  People  at  all  levels  need  to  be 
educated,  involved,  and  motivated  in  risk  management. 

•  Process must transform uncertainties into risks. The transformation is based
on identifying the sources of risk, analyzing the risk based on some
established criteria, planning alternative strategies for risk resolution, tracking
the risk metrics, and resolving the risk by triggering action plans. Unfortunately,
how to do the transformation, which is the key problem, is not addressed in
(Hall, 1997) or in (SEI, 1996).

•  Infrastructure  establishes  the  culture  that  supports  risk  management. 

•  Implementation  is  the  execution  of  the  plans,  assigning  responsibilities, 
authorities,  tools  and  methods. 

In Hall's method, checklists based on the SEI taxonomy, work breakdown
decomposition, meetings, reviews, and surveys are the tools for risk identification. All
these tools are human dependent and highly unstructured. Hence, the method is very
difficult to automate. However, Hall emphasizes the use of metrics to identify the occurrence
of risks, such as progress in milestones, size (LOC), change (requirements added,
changed, deleted), quality (number of defects), staff (turnover) and risk exposure. Risk
analysis, risk planning, risk tracking and risk resolution are based on planning, and a set
of resolution techniques and tools inherited from SEI's model. Hall's approach has the
same problems as SEI's model.


3.  Charette

(Charette, 1997) introduced the concept of risk management in maintenance. The
author states that during maintenance, risk management is more difficult than during
development. First, maintenance projects provide more opportunities for risk and less
freedom to mitigate it as a consequence of the previous version of the system. Second, it
involves more attention to customer-related issues. The approach uses SEI's
taxonomy as the tool to identify threats and SEI's software risk evaluation process to assess
the risk. Charette's approach has the same problems previously discussed for SEI's
model. The method relies on human experience.


4.  Jones 

During the 60's and the 70's, IBM focused significantly on software
processes. Many technologies were invented in IBM's laboratories: HIPO diagrams, joint
application design, formal inspections, structured walkthroughs, integrated cost and
estimation tools, and formal specifications. It is also significant that CMM has
characteristics that can be traced back to IBM, where Humphrey worked. Neither
SEI's CMM nor Software Productivity Research (SPR) (Jones, 1994) addresses how to
solve the problems of estimation. SPR is a software process introduced by Capers Jones
that has some characteristics very similar to those of CMM. Jones and Humphrey were both
working at IBM during the seventies, so it is not surprising that both models have common
characteristics. As an example, the five-level scale of CMM corresponds to the five-level
scale of SPR. (Jones, 1994) observed that significant risks are not the same across all
software domains. He introduced six categories of software projects with different kinds
of risks. Table 2.2 shows the percentage of projects at risk for each category. Note that
the table is ordered so that the risk factors most common to all project categories
appear at the top.

Table 2.2: Jones's top risk factors (Jones, 1994)

Jones  stated  that  the  ten  most  serious  risk  factors  observed  in  the  SPR  assessments 
are: 

(1)  Inaccurate metrics. The generalized use of LOC as a productivity metric
introduces errors because of the differences in languages and
programming styles. Counting LOC does not address the complexity
involved in recursion or in the object-oriented paradigm. LOC is very difficult
to estimate during the requirements phase. Albrecht addressed this problem with
the introduction of function points. However, recently Kitchenham,
Kemerer and others have introduced some criticisms of this metric. This
issue will be discussed in Chapter III.

(2)  Inadequate  measurement.  Data  collection  is  not  always  correctly  done, 
even  in  the  case  of  cost  collection.  One  major  leak  in  terms  of  cost  is  the 
work  of  end  users. 

(3)  Time  pressure  introduced  by  irrational  schedules  or  by  continuously 
changing  requirements.  This  second  factor  is  more  intense  as  the 
complexity  of  the  systems  grows.  Projects  with  more  than  1000  function 
points  are  most  likely  to  experience  this  problem. 

(4)  Management  weaknesses  due  to  lack  of  education  in  estimation,  planning, 
measuring  and  assessment. 

(5)  Inaccuracies  in  cost  estimation.  Despite  the  numerous  commercial 
software  tools  available,  the  use  of  estimation  tools  is  not  generalized. 

(6)  Naive  belief  that  moving  to  a  new  technology  will  create  improvements  in 
productivity  or  quality. 

(7)  Late requirements. Even with the availability of methodologies like
prototyping, JAD or QFD, and metrics like function points or feature
points, which make it possible to understand the impact of changes, late
requirements continue to be a major threat.

(8)  Low quality. The current U.S. average is 5 defects per function point.

(9)  Low  productivity.  The  current  U.S.  average  for  military  projects  is  about  3 
function  points  per  man-month.  For  MIS  the  productivity  is  about  8 
function  points  per  man-month. 

(10)  Cancellation of projects is directly proportional to their size. This is
particularly critical above 10,000 function points or 1 million LOC.
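The averages quoted in items (8) and (9) allow a rough back-of-the-envelope calculation,
sketched below in Python. The sketch assumes that effort and defects scale linearly with
size, which is a simplification introduced here for illustration and not a claim made by
(Jones, 1994); the function names and the project size are hypothetical.

```python
from typing import Dict

# U.S. averages quoted in the list above (function points per man-month).
PRODUCTIVITY_FP_PER_MONTH: Dict[str, float] = {"military": 3.0, "mis": 8.0}
# U.S. average quoted in the list above (defects per function point).
DEFECTS_PER_FP: float = 5.0


def effort_man_months(size_fp: float, domain: str) -> float:
    """Effort estimate, assuming productivity does not vary with size."""
    return size_fp / PRODUCTIVITY_FP_PER_MONTH[domain]


def expected_defects(size_fp: float) -> float:
    """Defect estimate, assuming defect density does not vary with size."""
    return size_fp * DEFECTS_PER_FP
```

For a hypothetical project of 1000 function points, this gives about 333 man-months for
a military project, 125 man-months for an MIS project, and about 5000 defects.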

The contribution of Jones reveals some common threats characteristic of
different types of software projects. Particularly significant is the impact of paperwork
and low productivity in DoD projects. The caveat of this work is that it does not provide a
method to manage risk, relying instead on the experience of the project manager to make the
right decisions.


5.  Karolak 

(Karolak, 1996) introduced a classification scheme that divides risk into three
software risk elements: Technical, Cost and Schedule. This model uses a subjective
Bayesian probability approach to assess software risks. Each of the three software risk
elements is influenced by ten risk factors according to Table 2.3:


Table 2.3: Karolak's scheme (Karolak, 1996)

(a)  "Organization" addresses risks associated with the maturity of the organization structure,
functions, management and communications.

(b)  "Estimation" addresses the risks associated with inaccuracies in estimating resources,
schedules and costs.

(c)  "Monitoring" refers to risks associated with identifying problems.

(d)  "Methodology" addresses the risks associated with the lack of formal methodology and
standards.

(e)  "Tools" refers to the risks associated with the development tools.

(f)  "Risk culture" addresses the characteristics of the management decision-making style.

(g)  "Usability" refers to risks associated with the software product after it is delivered.

(h)  "Correctness" addresses the risks associated with compliance with requirements after
delivery.

(i)  "Reliability" refers to the risks of failures after delivery.

(j)  "Personnel" includes the risks associated with the knowledge and skills of the development
team.

The key element to identify and measure risks in Karolak's approach is a
questionnaire used to evaluate the risk factors (81 questions: organization 8, estimation 7,
monitoring 7, methodology 7, tools 9, risk culture 11, usability 6, correctness 9,
reliability 12, and personnel 12). The answer to each question is a number between 0
and 1, where 0 represents none and 1 represents all. The main contribution of this model
is that it can be automated; indeed Karolak developed a tool called SERIM (Software
Engineering Risk Model). However, the problem with this approach is that even though
the tool provides support, human experience is still required as the key factor to identify
risks.
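To show how a questionnaire of this kind lends itself to automation, the following Python
sketch validates the answers and aggregates them per risk factor. The aggregation by
simple averaging is an assumption made only for illustration; the formulas actually used
by SERIM are not described here, and the answers in the example are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def factor_scores(answers: Dict[str, List[float]]) -> Dict[str, float]:
    """Average the answers of each risk factor (assumed aggregation rule).

    `answers` maps a risk factor name to the list of answers given to its
    questions; every answer must be a number between 0 and 1.
    """
    scores: Dict[str, float] = {}
    for factor, values in answers.items():
        if not values:
            raise ValueError(f"no answers for factor {factor!r}")
        if any(not 0.0 <= v <= 1.0 for v in values):
            raise ValueError(f"answers for {factor!r} must lie in [0, 1]")
        scores[factor] = mean(values)
    return scores


# Hypothetical answers for two of the ten risk factors.
EXAMPLE_ANSWERS: Dict[str, List[float]] = {
    "tools": [1.0, 0.5, 0.0],
    "personnel": [0.25, 0.75],
}
```

Even in such a sketch the answers themselves still come from a person, which is
consistent with the limitation noted above.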


6.  Project  Management  Institute  (PMI) 

The Project Management Institute (PMI) introduced a methodology for risk
management (Wideman, 1992) generalized for any kind of project. The method is based
on four phases: risk identification, risk assessment, risk response, and documentation.
Risk identification follows an informal approach based on taxonomies, experts' opinions
and workgroup techniques. The assessment phase may range from subjective evaluation
to the use of metrics. This phase also includes the analysis of impact. In this model there
are two planning activities: response planning and contingency planning; and three
typical risk response strategies: avoidance, deflection, and absorption. PMI uses the term
risk to denote two different concepts: the probability of occurrence of a threat and the
threat itself. Another terminology issue in this approach is the use of the term risk in
scenarios in which decisions are made under uncertainty rather than risk. The approach is
too general to be useful in software engineering.


7.  Mitre Corporation

Mitre Corporation developed a Web application (RAMP) to capture risk
management experience and retrieve experiences from other projects and advice. The user
enters the characteristics of his project in a static HTML form. A query is launched
over the RAMP databases, creating a dynamic HTML form with a set of projects with
similar characteristics. The user can select one or more of these projects, and a second
script retrieves risks from the database. The result of this second query is a report
containing links to the applicable documents (Garvey, 1997). This approach helps the
decision-maker by providing related documents about similar projects, but it does not
remove the need for human experience to manage risk.


8.  Rockwell 

At Rockwell, an improvement in communicating risks more effectively resulted in
the following benefits: predictable program performance, better reviews, improved
process, and improvements in management practices. Three key elements are the cause of
successful risk management at Rockwell: a repeatable process, widespread access to
adequate knowledge, and functional behavior (defined as human factors).

Functional behavior implies human interactions, motivations and incentives,
perceptions and perspectives, communication and consensus, and decision making and
risk tolerance. (Gemmer, 1997) identified the following functional behaviors: a) manage
risk as an asset, b) treat decision making as a skill, c) actively seek risk information, d)
seek diversity in perspectives and information sources, e) minimize uncertainty in time,
control and information, f) recognize and minimize bias in perceiving risk, g) plan for
multiple futures, h) be proactive, i) improve decision-making skills, and j) reward
those who identify and manage risks early.

Gemmer identified the following causes of risk: a) uncertainty in time, b)
uncertainty in control, and c) uncertainty in information. Risk management is usually an
uncertainty scenario characterized by: a) uncertainty in the impact or consequence, b)
a time frame in which to prevent or mitigate, c) a coupling or domino
effect, and d) uncertainty about the probability distribution function (Gemmer,
1997).


9.  Boehm 

Boehm has been studying the problem of risk management for more than a
decade. His contributions to the area are notable. He introduced the importance of
verification and validation of software requirements and design specifications during
early phases of the project as a way to mitigate risk (Boehm, 1984). Such activities
address the a) completeness, b) consistency, c) feasibility, and d) testability of the
specifications. Completeness implies that all the documents and references exist and that
there are no missing items, functions or products. Consistency is both internal and
external, and implies traceability. Feasibility requires validating that the project can be
achieved with the actual resources, that it will satisfy the users' needs, and that it will be
maintainable, as well as estimating the risk. Testability requires unambiguous and
quantitative specifications.

Boehm introduced the Spiral model (Boehm, 1988) as a substitute for Royce's
Waterfall model. The Spiral model was the first software process in which risk
assessment was a driving factor. The author recognized, however, that there exist
difficulties in applying his model: a) matching the evolving process with contracts; b)
reliance on risk-assessment expertise, since the model is people dependent in terms of
identification, management and risk-driven specification; c) the need for further
elaboration of the spiral steps (Boehm, 1988); d) ambiguities about how to initiate,
terminate and iterate within the spiral; e) complexities in handling incremental
development such as refinements from previous versions; f) difficulties in formalizing
processes; and g) some steps turning out more complex than envisioned (Boehm, 1988a).

In (Boehm, 1989) and (Boehm, 1991) he introduced a method for risk
management (Table 2.4). Risk management is divided into two families of activities: risk
assessment and risk control.


Table  2.4:  Boehm's  classification  (Boehm,  1991) 

Risk  assessment  is  decomposed  into: 

(1)  Risk  identification  by  use  of  checklists,  decision  driver  analysis,  assumption 
analysis,  and  decomposition. 

a.  Checklist  (top  10  risks) 

•  Personnel  shortfalls. 

•  Unrealistic  schedules. 

•  Requirement  risks. 

•  Developing  the  wrong  functionality. 

•  Developing  the  wrong  user  interface. 

•  Developing  extra  functionality  not  essential  or  with  marginal 
usefulness. 

•  Continuous  stream  of  requirement  changes. 

•  Problems  in  external  components. 

•  Problems  in  external  tasks. 

•  Performance  shortfalls.  Straining  computer  science 
capabilities  (trying  to  do  more  than  the  possibilities  of  the 
state  of  the  art  technology):  distributed  processing,  AI, 
human-machine  interface,  algorithm  speed  and  accuracy, 
computer  security,  reliability  and  fault  tolerance. 

b.  Decision  driver  analysis: 

•  Politically  driven  decisions. 

•  Marketing  driven  decisions. 

•  Applying the wrong solution to the problem because there
exist compromises or preferences (as in the story of the man who
looked for his lost keys at night under a light, rather than in
the spot where he had presumably lost them, because that spot
was dark).

•  Short-term versus long-term decisions.

c.  Assumption  analysis. 

•  Comparison  with  previous  experience. 

•  Pessimistic approach (Murphy's Law).

d.  Decomposition

•  Pareto  80-20  phenomena. 

•  Task  dependencies  (high  fan-in  implies  risk:  if  anything  slips 
the  project  aborts.  High  fan-out  also  implies  risk:  if  the 
precondition  slips  then  the  effect  is  in  many  parts  of  the 
project). 

•  Uncertainty  areas  in  the  plan. 

(2)  Risk  Analysis: 

a.  Decision  trees. 

b.  Network  analysis  using  PERT  and  probabilistic  network  analysis. 

c.  Cost  risk  analysis  using  COCOMO,  Putnam  or  other  estimation  tool 
for  effort  and  duration. 

d.  Automated  analysis  tools  (PROMAP,  PROSIM,  RISNET,  SLAM, 
Opera/Open  Plan,  PRISM,  REP). 

(3)  Risk  Prioritization: 

a.  Assess  the  risk  probabilities  from  historical  data,  Delphi  or  other  group 
technique. 

b.  Deal  with  compound  risks. 

c.  Deal with triggered risks (domino effect).

Risk  control  is  decomposed  into: 

(1)  Planning. 

(2)  Resolution. 

(3)  Monitoring  (milestone  tracking  and  top-10  risk  tracking). 


Boehm warned that current approaches to the software process tend to make
high-risk commitments. "The waterfall model tempts to over promise
software capabilities in contractually binding requirements specifications before
analyzing the implications. The evolutionary development makes too easy to introduce
new ideas and requirements that can lead to a disaster." (Boehm, 1991). More recently, in an
article coauthored with De Marco, he took a pessimistic and pragmatic position, stating "doing
software risk management makes good sense, but talking about it can expose you to legal
liabilities. If a software product fails, the existence of a formal risk plan that
acknowledges the possibility of such a failure could complicate and even compromise the
producer's legal position." (Boehm, 1997).

Boehm's contributions to risk management are multiple. This research selected the
most important ones: the Spiral model, his analysis of the activities required for
risk management, and his risk management method. Due to its relevance, the Constructive
Cost Model (COCOMO) is discussed in a separate section. Despite
his contributions, Boehm recognizes that the issue of relying on humans to assess risk
remains unsolved. The use of checklists, decision driver analysis, assumption analysis,
and decomposition is not enough to automate risk identification and assessment.


10.  McFarlan 

McFarlan introduced a model to assess risk in information system projects based
on a three-dimensional checklist covering the three major dimensions that influence the risk
inherent in a project: a) project size in terms of budget, staffing levels, elapsed time and
number of departments affected; b) experience with the technology; c) project structure in
terms of definition of the tasks and deliverables (McFarlan, 1974). The importance of his
contribution resides in the identification of different facets of software projects. This
model relies on checklists and on the experience of the decision-maker to evaluate risk.


11.  Gilb 

In his classic text on software engineering management, Gilb (Gilb, 1988) presented
a set of principles or rules of engagement with risk. The approach is informal. Gilb's
principles are heuristics that were the state of the art at that time. His work was included
because he was a pioneer in recognizing the problem and the need to be proactive.


12.  USAF 

(USAF, 1988) defines risk as the probability, at a given point in a system's life
cycle, that predicted goals cannot be achieved with the available resources. Due to the
high degree of uncertainty, high precision is not useful during the early phases. As the
system progresses the uncertainty is transformed into risk, and therefore higher precision is
required. The USAF introduced a method to abate risk based on checklists and
estimations of probability of occurrence and effects. They decompose software risk into
four dimensions: performance, support or maintainability, cost, and schedule. The effects
on the project are categorized as catastrophic, critical, marginal and negligible. The four
risk dimensions are measured in terms of their probability of occurrence and their effect
according to Table 2.5.


Table 2.5: USAF scheme for risk

Probability     1.0 - 0.7    0.7 - 0.4    0.4 - 0.0    0.0
                Frequent     Probable     Improbable   Impossible

Impact categories (rows): Catastrophic, Critical, Marginal, Negligible
Risk levels (cells): HIGH, MODERATE, LOW, NONE

The  USAF  method  is  very  simple  and  robust.  However,  it  is  informal,  relying  on 
checklists  and  experience  of  the  evaluator. 


D.  ESTIMATION  MODELS 

This section presents three models for estimating the effort and duration of software
projects: COCOMO, Putnam and function points. The importance of these estimation
models is that they constitute a preliminary approach to assessing risk.


1.  The COCOMO Family

COCOMO (for Constructive Cost Model), introduced by (Boehm, 1981), is a
family of models constituted by Basic, Intermediate, and Detailed COCOMO. Basic
COCOMO is an easy-to-calculate model applicable to small to medium software projects.
Intermediate COCOMO is based on the Basic model and includes effort adjustment
factors. Detailed COCOMO accounts for the influence of additional factors on individual
project phases. These earlier models are known as COCOMO 81.

Projects are classified into three categories: a) organic, characterized by
small size, small teams and low environmental noise; b) embedded, characterized by
strong, complex coupling with hardware or other kinds of tight constraints, as in real-time
systems; and c) semidetached, intermediate between the previous two
categories. The details of the model can be found in (Boehm, 1981), but it is important to
highlight the following assumptions, which show the optimistic bias of the model.

•  The  development  period  considered  by  COCOMO  81  starts  at  the  beginning 
of  the  design  phase.  The  requirements  phase  is  not  covered. 

•  The  estimation  covers  only  the  direct-charged  labor.  Costs  related  to  computer 
center  operators,  secretaries,  higher  management,  and  support  are  excluded. 

•  The model assumes that a man-month is 152 hours of working time.

•  The model assumes that the project will enjoy good management.

•  Finally,  the  model  assumes  that  the  requirements  will  remain  unchanged. 
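
To make the three project categories concrete, the following is a minimal sketch of Basic COCOMO 81. The coefficients are the ones commonly published for (Boehm, 1981); they are not listed in this text and are reproduced here as an assumption. The function name and the dictionary are illustrative only.

```python
# Basic COCOMO 81 sketch.
# Coefficients (a, b, c, d) per project mode are the commonly published
# values for (Boehm, 1981); they do not appear in this text (assumption).
BASIC_COCOMO = {
    "organic":      (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded":     (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc, mode):
    """Return (effort in person-months, development time in months)."""
    a, b, c, d = BASIC_COCOMO[mode]
    effort = a * kloc ** b      # effort grows slightly faster than size
    tdev = c * effort ** d      # schedule grows much more slowly than effort
    return effort, tdev


if __name__ == "__main__":
    for mode in BASIC_COCOMO:
        effort, tdev = basic_cocomo(32.0, mode)
        print(f"{mode:12s} effort = {effort:7.1f} PM, time = {tdev:5.1f} months")
```

Note that the only input is the size in KLOC, and that none of the excluded items listed above (requirements phase, indirect labor, requirement changes) appears anywhere in the calculation.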

The input parameter for COCOMO 81 is the estimated size in thousands of lines
of code (KLOC), which constitutes a drawback because of the difficulty of predicting the
size during early stages. COCOMO II addresses the problem of size estimation
by introducing a more abstract indicator of size called object points (a variation of function
points). This model is being calibrated.


2.  Putnam

In the 1950s, Peter Norden of IBM developed a manpower model. He used the
following curve of the Weibull distribution family, named after the 19th century physicist
Lord Rayleigh:

$y = K\,(1 - e^{-a t^{2}})$, and its first derivative

$y' = 2 K a t\, e^{-a t^{2}}$, where

y = cumulative percentage of total effort

y' = manpower rate in terms of people per unit of time

K = effort in man-units of time

t = development time

a = a constant governing the time to manpower peak.
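
The role of the constant a can be seen in a short sketch of the two curves. The peak-time expression below is not stated in the text; it follows from setting the derivative of y' to zero, which gives t = 1 / sqrt(2a). Function names are illustrative.

```python
import math


def cumulative_effort(t, K, a):
    """y = K (1 - exp(-a t^2)): effort spent up to time t."""
    return K * (1.0 - math.exp(-a * t * t))


def manpower_rate(t, K, a):
    """y' = 2 K a t exp(-a t^2): people per unit of time at time t."""
    return 2.0 * K * a * t * math.exp(-a * t * t)


def peak_time(a):
    """Time of the manpower peak.

    d(y')/dt = 2 K a (1 - 2 a t^2) exp(-a t^2) = 0  =>  t = 1 / sqrt(2 a)
    """
    return 1.0 / math.sqrt(2.0 * a)


if __name__ == "__main__":
    K, a = 100.0, 0.5           # hypothetical values, for illustration only
    tp = peak_time(a)
    print("peak at t =", tp)
    print("manpower at peak =", manpower_rate(tp, K, a))
    print("fraction of K spent at peak =", cumulative_effort(tp, K, a) / K)
```

At the peak, a t^2 = 1/2, so the fraction of the total effort already spent is 1 - exp(-1/2), about 39%, regardless of K and a.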

During the 1970s, Putnam, an alumnus of the NPS, introduced a model applying the
concepts developed earlier by Norden at the IBM development laboratory in
Poughkeepsie. This model is supported by a commercial tool named SLIM (Software
Life Cycle Management). The use of the Rayleigh curve as a reasonably good fit for the
manpower distribution has been confirmed by Norden, (Putnam, 1980) and (Boehm, 1981).
Putnam observed that there exists a strong correlation between lines of code and schedule,
manpower and defects. He recognized differences in terms of development difficulties
between real-time systems and normal information systems (Putnam, 1980 and 1996).
Putnam's model is based on the following assumptions (Londeix, 1987):

•  A development project is a finite sequence of purposeful, temporally ordered
activities, operating on a homogeneous set of problem elements, to meet a
specified set of objectives.

•  The  number  of  problem  elements  is  unknown  but  finite. 

•  Problems  are  detected,  recognized  and  solved  applying  effort. 

•  The  occurrence  of  problem  solving  follows  a  Poisson  process. 

•  The number of people working on the project is proportional to the number of
problems ready to be solved at that time.

The main equation of this model relates the size of the project in lines of code to
the effort and the schedule:

$S = C_k\, K^{1/3}\, t_d^{4/3}$, where

S = number of delivered source instructions
K = life-cycle effort in man-years
t_d = development time in years
C_k = a "technology constant"

The required development effort (DE) is estimated as 40% of the life-cycle effort.

That is:

$DE = 0.4\,K = 0.4\,(S / C_k)^{3}\,(1 / t_d^{4})$.
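
The following sketch solves the model for the effort, assuming the commonly published form of Putnam's software equation, S = C_k K^(1/3) t_d^(4/3), which gives K = (S / C_k)^3 / t_d^4. The function names and the sample values are hypothetical.

```python
def life_cycle_effort(size_loc, tech_constant, dev_time_years):
    """K in man-years, from S = C_k * K**(1/3) * t_d**(4/3)."""
    return (size_loc / tech_constant) ** 3 / dev_time_years ** 4


def development_effort(size_loc, tech_constant, dev_time_years):
    """DE = 0.4 K, the development share of the life-cycle effort."""
    return 0.4 * life_cycle_effort(size_loc, tech_constant, dev_time_years)


if __name__ == "__main__":
    # Hypothetical project: 100,000 lines, C_k = 10,000, two-year schedule.
    print(development_effort(100_000, 10_000, 2.0))   # man-years
    # Compressing the same project into one year:
    print(development_effort(100_000, 10_000, 1.0))
```

Because the development time enters with the fourth power, halving the schedule multiplies the estimated effort by sixteen, and because the size enters with the third power, a small error in S is strongly amplified. Both effects are relevant to the criticisms of the model discussed later in this section.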

One difficulty of the approach, as with COCOMO, is the requirement of knowing
the number of lines of code at the beginning of the project. Putnam suggests the use of
the Delphi method to estimate S.

Let a = minimum size estimation,
b = most likely size,
c = maximum size estimation.

The estimator of the expected size is E(S) = (a + 4b + c) / 6.

And the estimator of the standard deviation is s = (c - a) / 6.
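
The two estimators can be written directly as code. This is a minimal sketch; the function names and the sample figures are hypothetical.

```python
def expected_size(a, b, c):
    """E(S) = (a + 4b + c) / 6, weighting the most likely size four times."""
    return (a + 4 * b + c) / 6


def size_std_dev(a, c):
    """s = (c - a) / 6, one sixth of the range between the extremes."""
    return (c - a) / 6


if __name__ == "__main__":
    # Hypothetical estimates in lines of code: minimum, most likely, maximum.
    a, b, c = 10_000, 20_000, 60_000
    print(expected_size(a, b, c), size_std_dev(a, c))
```

When the maximum is much farther from the most likely value than the minimum is, as in this sample, the expected size lies above the most likely size.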

Another difficulty is to estimate the technology constant C_k. Putnam suggests
deriving it from previous projects. That is, by analyzing post-mortem projects with known S,
K and t_d it is possible to derive the value of C_k. This approach introduces two constraints:

•  Applying the model requires historical data to be available.

•  The  development  process  must  be  repeatable,  that  is  at  least  CMM  level  2. 
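
A minimal sketch of such a calibration follows, assuming the commonly published form of Putnam's equation, S = C_k K^(1/3) t_d^(4/3), solved for C_k. Averaging the per-project values is an illustrative choice, not a procedure taken from the text, and the project data are hypothetical.

```python
def technology_constant(size_loc, effort_man_years, dev_time_years):
    """C_k = S / (K**(1/3) * t_d**(4/3)) for one completed project."""
    return size_loc / (effort_man_years ** (1 / 3) * dev_time_years ** (4 / 3))


def calibrate(projects):
    """Average C_k over (S, K, t_d) tuples of post-mortem projects.

    A plain mean is an assumption made for illustration.
    """
    values = [technology_constant(s, k, t) for s, k, t in projects]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical history: (lines of code, life-cycle man-years, years).
    history = [(100_000, 62.5, 2.0), (40_000, 20.0, 1.5), (250_000, 180.0, 3.0)]
    print(calibrate(history))
```

The spread of the per-project values gives a rough indication of how repeatable the development process is, which is the second constraint listed above.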

(Boehm, 1981) states that this method is not good for projects employing
incremental development, but this comment could be a little biased. Nevertheless,
changes in requirements lead to a new estimation. According to its author, the method is
not precise for small projects with a development time of two years or less. This seems to
be caused by a more rectangular manpower pattern observed in small projects. The
method has been verified with more than 4000 projects. (Conte, 1986) also observed that
the model works "reasonably well" on very large systems but overestimates the effort on
medium and small ones. Other criticisms by the same authors are the exaggeration of the
effect of time compression on the development effort, excessive weight on the size, and
excessive sensitivity to changes in the technology constant.


During this research an experiment was conducted to compare Putnam's model
with COCOMO 81. The experiment consisted of comparing the estimates for 100 projects
with sizes from 10 KLOC to 1 MLOC using Basic COCOMO for organic, semidetached
and embedded systems with Putnam's estimation. To avoid tuning problems, the effort
in Putnam's model was computed using the average of the COCOMO development times.
Similarly, the time in Putnam's model was calculated using the average of the COCOMO efforts.
Both cases used a technology constant of 10100, as suggested in (Boehm, 1981). The
following graphs show the findings:

•  In terms of effort, Putnam's model is almost the average of embedded and
semidetached COCOMO (Fig. 2.3).

•  In terms of development time, the models are quite similar, with Putnam's
estimate being more optimistic (Fig. 2.4).


Figure 2.3: Effort Estimated Using COCOMO and Putnam Models

Figure 2.4: Development Time Estimated Using COCOMO and Putnam Models

3.  Function  Points 

Functional complexity has been studied for years because it is highly correlated
with effort and risk. The traditional functional complexity metric, Function Points, was
introduced by (Albrecht, 1979 and 1983). Function Points had enormous success because:

•  It  is  an  early  metric.  It  can  be  calculated  after  the  preliminary  analysis  of  the 
system. 

•  It is easy to calculate. There are only five input parameters to compute and
fourteen fine-tuning adjustments, and the whole process can be done
manually.

•  It  was  the  first  metric  that  related  complexity  to  number  of  lines  of  code. 

The procedure for calculating Function Points is quite simple. It requires
counting the number of inputs, outputs, queries, files, and system interfaces. Each of the
five parameters is classified as simple, medium or complex. Depending on the
parameter and its complexity, the count is multiplied by a weight factor. Table 2.6
presents the template for the calculation.

Table 2.6: Function Points Calculation (Albrecht, 1983)

              Simple         Medium         Complex         Total
              (count*weight) (count*weight) (count*weight)
Inputs        (__ * 3)   +   (__ * 4)   +   (__ * 6)    =   ____
Outputs       (__ * 4)   +   (__ * 5)   +   (__ * 7)    =   ____
Queries       (__ * 3)   +   (__ * 4)   +   (__ * 6)    =   ____
Files         (__ * 7)   +   (__ * 10)  +   (__ * 15)   =   ____
Interfaces    (__ * 5)   +   (__ * 7)   +   (__ * 10)   =   ____

NAFP = sum of the Total column

The resulting total is called Non-Adjusted Function Points (NAFP). Fourteen
adjustment factors describing the environment, whose values are in the range of zero to
five, are summed. Finally, the Function Points are calculated by the formula:

$FP = NAFP \times (0.65 + 0.01 \sum_{i=1}^{14} F_i)$
where NAFP is the non-adjusted Function Points
F_i is each of the fourteen adjustment factors
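
The whole procedure fits in a few lines of code. The sketch below uses the weights of Table 2.6 and the adjustment formula above; the function names, the dictionary layout and the sample counts are hypothetical.

```python
# Weights for (simple, medium, complex) items, as in Table 2.6.
WEIGHTS = {
    "inputs":     (3, 4, 6),
    "outputs":    (4, 5, 7),
    "queries":    (3, 4, 6),
    "files":      (7, 10, 15),
    "interfaces": (5, 7, 10),
}


def non_adjusted_fp(counts):
    """NAFP: sum of count * weight over all parameters and complexity levels."""
    return sum(
        n * w
        for name, per_level in counts.items()
        for n, w in zip(per_level, WEIGHTS[name])
    )


def function_points(counts, factors):
    """FP = NAFP * (0.65 + 0.01 * sum(F_i)) with fourteen factors in 0..5."""
    if len(factors) != 14 or any(not 0 <= f <= 5 for f in factors):
        raise ValueError("expected fourteen adjustment factors between 0 and 5")
    return non_adjusted_fp(counts) * (0.65 + 0.01 * sum(factors))


if __name__ == "__main__":
    # Hypothetical system: counts of (simple, medium, complex) items.
    counts = {
        "inputs": (2, 1, 0),
        "outputs": (1, 1, 1),
        "queries": (0, 2, 0),
        "files": (1, 0, 0),
        "interfaces": (0, 0, 1),
    }
    print(non_adjusted_fp(counts))
    print(function_points(counts, [3] * 14))
```

Since the sum of the factors ranges from 0 to 70, the adjustment can change the non-adjusted count by at most 35% in either direction.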

Despite its attractive approach, Function Points has many weaknesses. First of all,
the metric was derived from a study of MIS projects in the seventies. Today, there are
many issues that are not considered by the metric and that contribute to complexity.
For instance, recursive functions, reuse, inheritance, communication by messages, and
polymorphism are not covered by the metric. The languages have also evolved, and differ
a lot from the COBOL of the seventies. Programming styles have experienced a dramatic
change that is not reflected in the metric.

(Kemerer, 1993) reported some weaknesses of the metric. Similar results have
been reported by (Kitchenham, 1993) and (Kitchenham, 1997). The main issue is that
function points is not a well-formed metric because there is a correlation between its
constituent elements. In her conclusions Kitchenham stated that:

•  The individual function point elements were not independent.

•  Not all the function point elements are related to effort.

•  An effort prediction metric based on inputs and outputs is just as good a
predictor as function points.

•  An effort prediction metric based on the number of files and the number of
outputs was only slightly worse than Function Points.

•  Good estimates require estimation methods and models based on
the organization's performance, working practices, and software experience.

•  Uncertainty and risk cannot be managed effectively at the individual project
level, but in the organization context. If a single project had to insure against
all possible risks and uncertainty, its cost would be prohibitive. The sources
of estimation uncertainty are: a) measurement error, caused because some of the
input variables in a model have inherent accuracy limitations; b) model error,
caused because no estimation model can include all the factors involved; c)
assumption error, caused because some of the hypotheses about input
parameters are incorrect; and d) scope error, caused because the project under
study is outside the model's domain.

Even though there was evidence of defects in the metric, nobody introduced a better
alternative. Function points therefore remained for many years the most common prediction
metric. More recently, some extensions to function points have been introduced, such as
“feature points” and “Boeing’s 3-F function points”, addressing effort estimation for
embedded systems.


4.  Conclusions  about  COCOMO,  Putnam  and  Function  Points 

All these methodologies have some weaknesses with respect to software
evolution. First, the need for a size estimate as an input parameter limits the applicability
of the COCOMO and Putnam methods. Second, the characteristics counted in function
points are quite different from the specification attributes. Third, the criticisms introduced
by (Kemerer, 1993; Kitchenham, 1993 and 1997) suggest that, despite the correlation
observed between complexity and size, other metrics could be more accurate than this
metric.

E.  MODERN PROJECT MANAGEMENT TECHNIQUES: ViteProject

ViteProject is a modeling and simulation tool that integrates the organizational
work of projects, making explicit the interdependencies between tasks and roles not only from
the producer-consumer point of view, as in CPM or PERT, but also in terms of communication
and rework dependencies. ViteProject is the commercial version of VDT (Virtual Design
Team), a research project based on contingency theory directed by Dr. Raymond Levitt at
Stanford (Jin, 1996).

CPM models sequential interdependencies through an explicit representation of
precedence relationships between activities. This simplified vision of the project cannot
address the dynamics created by reciprocal information requirements in concurrent
activities, exception management, and the impacts of actor interactions.

The  original  model  of  VDT  was  based  on  the  following  observations  about 
collaborative,  multidisciplinary  work  in  large  complex  projects: 

•  Organizational  tasks  in  the  project  can  be  divided  into  two  categories: 
production  work  that  directly  adds  value  to  the  product,  and  coordination 
work  that  facilitates  the  previous  one. 

•  Contingency theory provides qualitative insights about the extent of
coordination work, but does not provide information about how to address the
bottleneck problems created by coordination.

The model integrates the micro-level description of the entities that perform work
and process information, called "actors". Actors can be individuals or small teams acting
as a unique and cohesive unit where individuals are not differentiated. Actors have two
basic behaviors: attention allocation and information processing. As a consequence of
such behaviors, actors perform production and coordination. The model is based on the
following assumptions:

•  Actor  allocation  assumption:  Each  actor  has  one  input  buffer  where  all  the 
incoming  information  and  requests  for  production  or  coordination  work  arrive. 
The  input  buffer  is  a  queue  that  supports  different  policies:  priorities,  FIFO, 
and  random.  Each  actor  has  also  an  output  buffer  to  place  its  accomplished 
work. 

•  Actor capacity allocation assumption: An actor has a certain information-processing
capacity determined by its skill type, skill level, and allocable time.
An information-processing work item can be processed and completed if the actor
allocates sufficient capacity to the job. This assumption implies that: a)
information processing requires not only attention but also time; b) the
information content of a work item is related to the skills; c) the volume of a work item
is related to the time; d) actors have limited capacity to allocate.

•  Actors cannot allocate 100% of their capacity to work because they are
interrupted by: a) information requests from other actors; b) decision-making
to solve exceptions produced by subordinate actors; c) meetings; and d)
processing noise, that is, all other interruptions created outside the project that
have an impact on the actor.
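
The three assumptions can be illustrated with a toy actor. This is only a sketch under simplifying assumptions: it is not the VDT or ViteProject implementation, skills are ignored, capacity is measured in hours, and all class, method and parameter names are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class WorkItem:
    name: str
    hours: float        # volume of the work item
    priority: int = 0   # smaller value = more urgent (assumed convention)


class Actor:
    """Toy actor with an input buffer, an output buffer and limited capacity."""

    def __init__(self, capacity_hours, policy="fifo", seed=0):
        if policy not in ("fifo", "priority", "random"):
            raise ValueError("unknown queue policy: " + policy)
        self.capacity_hours = capacity_hours
        self.policy = policy
        self.rng = random.Random(seed)
        self.inbox = []     # incoming production and coordination requests
        self.outbox = []    # accomplished work

    def receive(self, item):
        self.inbox.append(item)

    def _pick(self):
        """Index of the next item according to the queue policy."""
        if self.policy == "fifo":
            return 0
        if self.policy == "priority":
            return min(range(len(self.inbox)),
                       key=lambda i: self.inbox[i].priority)
        return self.rng.randrange(len(self.inbox))

    def work(self, interruption_hours=0.0):
        """Process items until capacity runs out; return unused capacity.

        Interruptions (requests, exceptions, meetings, noise) are modeled
        as hours subtracted from the capacity before any work starts.
        """
        available = max(0.0, self.capacity_hours - interruption_hours)
        while self.inbox:
            index = self._pick()
            if self.inbox[index].hours > available:
                break
            item = self.inbox.pop(index)
            available -= item.hours
            self.outbox.append(item)
        return available
```

With an eight-hour capacity, two hours of interruptions and queued items of three and four hours, a FIFO actor completes only the first item; the second stays in the input buffer, which is how backlogs build up when coordination load grows.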

The organization structure is modeled through simulation. Organization
variables such as control structure, communication structure, formalization and matrix
strength influence the actors' micro-level actions, from which the organization's
emergent performance arises. The use of complex adaptive systems to model
organizational behavior has also been discussed by (Brown, 1998).


F.  ORGANIZATIONAL THEORY


This section introduces some foundations of organizational theory that support the
research. Why review organizational foundations if the research is about software
engineering? First, software development requires teamwork, more specifically organized
work. It is therefore necessary to understand the dynamics of organizations as artificial social
entities that exist to achieve a specific purpose, in this case to develop software. Second,
organizations are made up of individuals who accomplish diverse, disaggregated activities
that require coordination and consequently information exchange. These two activities,
despite their impact, have not been covered by the research in estimation models. Third,
ViteProject is customized for general projects. To obtain a rigorous simulation, the tool
must be customized according to the characteristics of software engineering.

1.  Introduction 

As software systems increased in complexity, software development evolved from
a primitive art into software engineering. Methodologies and software tools were
developed to help development processes. Most of the present tendencies (DOD-STD-2167A,
ISO-9001, SEI/CMM) try to standardize processes, emphasizing planning and
structure (Humphrey, 1990). Some authors criticize those approaches, stating that they
underestimate the dynamics of software development (Bach, 1994), (Abdel-Hamid,
1997). Others object that activities such as research and development are not addressed
by TQM principles (Dooley et al., 1994). In the author's opinion, many of the problems
in current software projects have organizational roots. This view is also supported by
(van Genuchten, 1991)¹ and (Capers Jones, 1994)². The typical software engineering
process is a succession of decision problems trying to transform a set of fuzzy
expectations into requirements, specifications, designs and finally code and
documentation. The traditional waterfall software process failed to accomplish its
purpose because it applied a method valid for well-defined and quasi-static scenarios.
This hypothesis is far from reality. Today, modern software processes (Boehm,
1988) (Luqi, 1989) are based on evolution and prototyping. These approaches recognize
the fact that software development presents an ill-defined decision problem, but they fail
to assess the risk automatically. In the author's view, software development projects
present special characteristics that must be addressed in order to achieve an
improvement in the state of the art. These particularities affect the strategic planning,
the organizational structure, and the engineering applied to software. In these three
areas chaos theory can provide clues for possible solutions.

¹ Van Genuchten found that 45% of all the causes for delayed software are related to organizational issues.

² Capers Jones found that on military software developments the two most common threats are excessive
paperwork (90% of the time) and low productivity (85% of the time).


(Woodward, 1965) studied the relationship between technological complexity
and structure, classifying technology into three types: a) unit (custom-made and
non-routine jobs), b) mass (large batch or mass production in assembly lines), and c) process
(highly controlled, standardized and continuous processing such as refineries). This
scheme was created for the manufacturing industry and is not very suitable for software
engineering. However, software engineering shares some characteristics of unit and process
technologies: a high proportion of skilled workers, low formalization and low centralization.


Perrow (Burton, 1998) introduced a two-dimensional classification of
technology (Fig. 2.5). The first dimension is the analyzability of the problem,
varying from well-defined to ill-defined. The second dimension is the task
variability, which means the number of expected exceptions in the tasks. The
scheme lacks a third dimension representing time. Hence, in this
projection, software engineering occupies part of the non-routine and of the
engineering regions. During the earlier phases of development the problem is usually
ill-defined. That is why the requirements phase is so prone to errors. After several
prototypes and evolution cycles the problem becomes well-defined and the
system can be specified. This is a key difference with other forms of engineering already
discussed in the first chapter. Highly skilled personnel, low formalization and
centralization, high information processing demand, and coordination obtained through
meetings characterize the organizations in this region.

Figure 2.5: Perrow's classification


A second line of research (Burton and Obel, 1998, pp. 174-180) introduced a
classification based on a four-variable model: equivocality, environmental complexity,
uncertainty and hostility. Equivocality is "the existence of multiple and conflicting
interpretations"; it is a measure of the lack of knowledge, or the level of ignorance,
about whether a variable exists in the space. Uncertainty is the lack of knowledge about the
likelihood of values for the known variables. Environmental complexity is the number of
factors in the environment affecting the organization and their interdependency. Finally,
hostility is "The level of competition and how malevolent the environment is." In Table
2.7, the fourth variable, hostility, was disregarded because when hostility grows over a
certain threshold, it overrules the other factors (Burton & Obel, 1998, p. 177). In highly
hostile scenarios only a highly centralized organization ("regular army") or a
low-formalization, low-complexity organization ("guerrilla") is a possible alternative.


Table 2.7: Burton & Obel's scheme (Adapted from Burton & Obel, 1998 pp 181-182)

Equivocality  Environmental  Uncertainty  Formalization  Organizational  Centralization
              Complexity                                 Complexity
Low           Low            Low          High           Medium          High
Low           Low            High         Medium         High            Medium
Low           High           Low          High           Medium          Medium
Low           High           High         Medium         High            Low
High          Low            Low          Medium         Medium          High
High          Low            High         Low            Low             High
High          High           Low          Medium         Medium          Low
High          High           High         Low            Low             Low

Software development scenarios usually correspond to high equivocality that
decreases over time, high environmental complexity, and high uncertainty (the
high/high/high row, shaded dark gray in Table 2.7). These conditions call for low
formalization and low organizational complexity, with centralization inverse to the
environmental complexity. The recommended organization could be ad hoc or matrix, with
coordination by an integrator or group meetings. The information exchange is rich and
abundant. The incentive policy should be based on results. These parameters constitute the
key points for customizing the behavior matrix of ViteProject to software development.
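
The mapping in Table 2.7 lends itself to a simple lookup structure. The sketch below is
illustrative only: the names `Design`, `BURTON_OBEL`, and `recommend` are assumptions
introduced here and are not part of ViteProject or of Burton & Obel's work. The
high/high/high entry follows the description given in the preceding paragraph.

```python
from typing import NamedTuple


class Design(NamedTuple):
    """Recommended organizational design (the three right-hand columns of Table 2.7)."""
    formalization: str
    organizational_complexity: str
    centralization: str


# Keys are (equivocality, environmental complexity, uncertainty).
BURTON_OBEL = {
    ("Low", "Low", "Low"): Design("High", "Medium", "High"),
    ("Low", "Low", "High"): Design("Medium", "High", "Medium"),
    ("Low", "High", "Low"): Design("High", "Medium", "Medium"),
    ("Low", "High", "High"): Design("Medium", "High", "Low"),
    ("High", "Low", "Low"): Design("Medium", "Medium", "High"),
    ("High", "Low", "High"): Design("Low", "Low", "High"),
    ("High", "High", "Low"): Design("Medium", "Medium", "Low"),
    ("High", "High", "High"): Design("Low", "Low", "Low"),
}


def recommend(equivocality: str, env_complexity: str, uncertainty: str) -> Design:
    """Look up the recommended design for one environment description."""
    return BURTON_OBEL[(equivocality, env_complexity, uncertainty)]


# Typical software development scenario: all three environmental variables are high.
software = recommend("High", "High", "High")
```

A table-driven form keeps the scheme easy to audit against the source and easy to
replace if a different contingency model is preferred.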


2.  The  Edge  of  Chaos 

Chaos theory describes a specific range of irregular behaviors in systems that
move or change (James, 1996). Chaotic does not mean random. The primary feature
distinguishing chaotic from random behavior is the existence of one or more attractors.
Without such attractors, quasi-chaotic scenarios would not be repeatable. A chaotic
system must be bounded, nonlinear, non-periodic, and sensitive to small disturbances and
mixing. A system that has all these properties can be driven into chaos. The edge of chaos
is defined as "a natural state between order and chaos, a grand compromise between
structure and surprise" (James, 1996). The edge of chaos can be visualized as an unstable,
partially structured state of the universe. It is unstable because it is constantly
attracted either to chaos or to absolute order. People tend to think that order is the
ideal state of nature. This could be a big mistake. Research on organizational theory
(Stacey, Nonaka, Zimmerman), management (Stacey, Levy), and economics (Arthur) supports
the theory that operation away from equilibrium generates creativity, self-organization
processes, and increasing returns (Roos, 1996).

Change occurs when there is some structure so that the change can be organized,
but not so rigid that it cannot occur. Too much chaos, on the other hand, can make
coordination and coherence impossible. Lack of structure does not always mean disorder.
Let us illustrate this idea with an example. A flock of migratory ducks on a lake has
little structure. However, a few minutes after they start flying, some order appears and
the flock creates a V-shaped formation. This self-organized behavior occurs because a
loose form of structure exists. Experiments with intelligent agents governed by three
rules show the same macro behavior: a) try to maintain a minimum distance from the other
objects in the environment, including other agents; b) try to match the speed of other
agents in the vicinity; and c) try to move toward the perceived center of mass of the
agents in the vicinity. Independently of the starting position of the agents, they always
end up in a flock. Even if an obstacle disturbs the formation, the pseudo-order is
recovered some time later. This self-organized behavior emerges despite the absence of
leadership and without an explicit order to form a flock.
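
The three rules can be sketched as a single simulation step. This is a minimal
illustration and not the setup of the experiments referred to above: the function name
`step`, the neighborhood radius, the minimum distance, and the rule weights are all
assumptions chosen for readability.

```python
def step(agents, radius=10.0, min_dist=1.0, w_sep=0.05, w_align=0.05, w_coh=0.01):
    """Advance all agents one tick. Each agent is a tuple (x, y, vx, vy)."""
    updated = []
    for i, (x, y, vx, vy) in enumerate(agents):
        # Neighbors are the other agents within `radius` of this one.
        near = [a for j, a in enumerate(agents)
                if j != i and (a[0] - x) ** 2 + (a[1] - y) ** 2 <= radius ** 2]
        ax = ay = 0.0
        if near:
            n = len(near)
            # Rule b: steer toward the average velocity of the neighbors.
            ax += w_align * (sum(a[2] for a in near) / n - vx)
            ay += w_align * (sum(a[3] for a in near) / n - vy)
            # Rule c: steer toward the neighbors' center of mass.
            ax += w_coh * (sum(a[0] for a in near) / n - x)
            ay += w_coh * (sum(a[1] for a in near) / n - y)
            # Rule a: move away from neighbors closer than `min_dist`.
            for a in near:
                dx, dy = x - a[0], y - a[1]
                if dx * dx + dy * dy < min_dist ** 2:
                    ax += w_sep * dx
                    ay += w_sep * dy
        vx, vy = vx + ax, vy + ay
        updated.append((x + vx, y + vy, vx, vy))
    return updated
```

Calling `step` repeatedly on a list of randomly placed agents lets one observe how the
group behaves under different weights. No agent holds a leadership role, and every agent
is updated from the same snapshot of its neighbors.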


A more interesting example is the behavior of software development teams. A
recent article (Cusumano, 1997) describes the strategies Microsoft uses to manage large
teams as small teams. Dr. Cusumano says "What Microsoft tries to do is allow many
small teams and individuals enough freedom to work in parallel yet still function as one
large team, so they can build large-scale products relatively quickly and cheaply. The
teams adhere to a few rigid rules that enforce a high degree of coordination and
communication." This seems to be a description of the emergent behavior of a complex
adaptive system. It is self-adaptive because the agents carry out the adjustment to the
environment, and it is emergent because it arises from the system and can only be partly
predicted. As in the example of the ducks, a few rules of interaction between the agents
(in this case people) generate an efficient behavior. The three rigid rules at Microsoft
are: a) developers integrate their work daily, forcing the synchronization and testing of
the work; b) developers responsible for bugs must fix them immediately, and are
responsible for the next day's integration; and c) milestone stabilization is sacred.
Another possible explanation of Cusumano's observations could be the presence of an
underlying structure that fosters creativity and productivity.


Complex adaptive systems, such as the one just described, are made up of multiple
interacting agents. The emergence of complex behavior requires some conditions. The
first condition is the existence of more than one agent. A second condition is that agents
must be sufficiently different from each other that their behavior is not exactly the same
in all cases. When agents behave in exactly the same way, they exhibit predictable, not
complex, behavior. Finally, a third condition is required: complex adaptive behavior only
occurs at the edge of chaos.


3.  Some  of  the  Risks  of  Being  in  the  Edge  of  Chaos 

Limiting the structure in organizations can be useful in situations where innovation
is critical or where bureaucracies need to be revitalized. However, if the structure is
weakened beyond a certain minimum, it leads to an undesired state. Some traits signal
the imminence of such an anarchic situation, known as the "chaos trap" (Brown & Eisenhardt^,
1999): a) the emergence of a rule-breaking culture, b) missed deadlines and unclear
responsibilities and goals, and c) random communication flows.

On the other hand, focusing on hierarchy and disciplined processes, with emphasis on
schedules, planning and job descriptions, may lead to a steady, inert bureaucracy.
Organizations in such a state react too late and fail to capture shifting strategic
opportunities. This is the "bureaucratic trap", which also has some observable warning
traits: a) a rule-following culture, b) rigid structure, tight processes and job
definitions, and c) formal communication as the only channel.

The alternative is "surfing" the edge of chaos, avoiding both attractors. That
requires limited structure combined with intense interaction between the agents, giving
enough flexibility to develop surprising and adaptive behavior. Organizations in this state
are characterized by an adaptive culture: people expect and anticipate changes. A
second characteristic is that the few key existing structures are never violated. Finally,
real-time communication is required throughout the entire organization.


^ Kathleen Eisenhardt is an NFS alumna.
Being at the edge of chaos implies an unstable position. Some perturbations
can rupture this delicate equilibrium and cause a fall into one of the two steady
states. A potential perturbation factor is the organizational collaboration style. Too much
collaboration can disturb the performance of each agent and, consequently, the whole
system is affected. On the other hand, too little collaboration destroys the advantage of
acting in an organized way and leads to paralysis. Other sources of perturbation are the
tendency to be tied to the past and to cultural idiosyncrasy or, on the contrary, to lose
the link with the past. In the first case, change becomes impossible. In the second, the
assets from previous experiences are not capitalized. The equilibrium point is called
regeneration. In such an unstable state, mutation can occur. Therefore, the inherited
characteristics that give competitive advantage in a certain scenario can be perpetuated
while new variations are introduced. If too little variation exists, natural selection
fails. This process allows complex adaptive systems to change over time following a
Darwinian pattern.

Kauffman (1995) introduced the concept of the fitness landscape. This concept can
be understood by observing the behavior of species. In the competition for survival,
species attempt to alter their genetic make-up through adaptation, trying to move to
higher "fitness points" where their viability will be enhanced. Species that are not able
to reach higher points on their landscapes may be outpaced by competitors who are more
successful in doing so. If that occurs, the risk of extinction increases. The same
principle applies between predator and prey. Each development in the abilities of one
species generates an improvement in the abilities of the other. This concept is called
co-evolution. Certain higher fitness points have more value to some species than to others.
The contribution a new gene can make to a species' fitness depends on the genes the species
already has. The more complicated (more evolved) the genetic pattern is, the higher the
probability that a new adaptation conflicts with it, which slows down the speed of
variation.

Natural selection is an effective, but not generally efficient, way to evolve (Brown
& Eisenhardt, 1998). The process requires some amount of mutation to avoid sudden
convergence on suboptimal characteristics. Some of the characteristics lost in the past can
be reintroduced and prove useful in the new scenario. Many errors are committed during this
blind process. A more efficient way to evolve is by recombination of the pool of genes
using genetic algorithms. This technique has been applied to improve the performance of
robots; however, the idea can also be used to improve the competencies of organizations.
Too much or too little variation always leads to the failure of the system.
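
Recombination with a small amount of mutation can be sketched as a toy genetic
algorithm over bit strings. This is an illustrative sketch, not the algorithm used in the
robotics work mentioned above: the function names, the bit-string encoding, the
population size, the mutation rate, and the choice to always keep the best individual
(elitism) are assumptions made here.

```python
import random


def crossover(a, b, point):
    """One-point recombination: the head of parent a joined to the tail of parent b."""
    return a[:point] + b[point:]


def mutate(genome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in genome]


def evolve(fitness, n_genes=16, pop_size=20, generations=30,
           mutation_rate=0.02, seed=0):
    """Return the best genome found and the best fitness seen in each generation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[:pop_size // 2]      # selection: the fitter half reproduces
        children = [pop[0]]                # elitism: the best genome is kept unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = crossover(a, b, rng.randrange(1, n_genes))
            children.append(mutate(child, mutation_rate, rng))
        pop = children
    return max(pop, key=fitness), history


# Example fitness: the number of 1-bits in the genome.
best, history = evolve(sum)
```

Varying `mutation_rate` is one way to explore the effect of too little or too much
variation: a rate of zero leaves the search confined to the bits present in the initial
population, while a very high rate destroys what recombination has assembled.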


4.  The  Strategic  Planning  Issue 

Traditional approaches to strategic planning emphasize picking a unique strategy
according to the competitive advantages of each organization. Porter's five-force
approach (Porter, 1980) assumes that there is some degree of accuracy in the
prediction of which industries and which strategic positions are viable, and for how long.
In a high-velocity scenario the assumption of a stable environment is too restrictive.
Customers, providers, competitors, and potential competitors, as well as substitute
products, are evolving faster than expected. The introduction of new information
technology tools, the Internet and the globalization of the markets are contributing to this
phenomenon, and nothing seems to reverse the process. The failure of long-term strategic
planning is not a failure of management; it is the normal outcome in a complex and
unpredictable environment. A growing number of consultants and academics (Santosus,
1998; Brown and Eisenhardt, 1999) are looking at complexity theory to help
decision-makers improve the way they lead organizations.

How useful could a map be of a territory whose topography is constantly changing?
In fast-changing environments, survival requires a refined ability to sense the external
variables. Traditional approaches rely on strategic planning and vision. However, in
unstable environments planning is not effective because it is impossible to predict
the scenario's evolution in terms of markets, technologies, customers' needs, etc.
Organizations relying on a single vision supported by tight planning risk paying little
attention to the future; their sensing organs become blind to it. Still, a certain amount
of inertia and commitment to the plans is required to prevent erratic changes caused by
reacting to diverse variables.

If the time window of the opportunities is shrinking, a different form of thinking
is required. The present technological situation can be described as a fast succession of
short-term niches. The ability to change is the key to success for surviving in such a
variable environment. In a systemic approach, General Systems Theory establishes
that organizations are systems whose viability depends on some basic behaviors (von
Bertalanffy, 1976):

•  Ability to sense changes in the environment. This is the most primitive form
of intelligence; if it is absent, the probability of survival is minimal.

•  Ability to adapt to a new environment by modifying the internal structure and
behavior. The system tries to self-regulate to survive the crisis in hostile
scenarios, or to take advantage of the opportunities in favorable ones.

•  Ability to learn from the past, anticipating the self-regulating behaviors and
structure before the environment changes. This ability requires intelligence able
to infer conclusions from the past according to the context of the variables
sensed in the present.

•  Ability to introduce changes in the environment, making it more favorable to
the system's needs. In this case, the system has developed the technology
(know-how and tools) to exert power over the environment.

Any mechanical or computing system has some or all of these abilities. These
same abilities can be found in any form of life. The more developed the system is, the
more of the above characteristics it has. Darwin's theory of evolution supports this line
of reasoning. Evolution takes the form of natural selection acting on inherited genetic
variation through successive generations over time. Variation is the way biological
systems probe the environment, presenting many alternatives, some of them ending in
failure but a few very successful. This process is an inefficient but very effective way of
improvement.


Experiments can provide a certain amount of knowledge about the future. In some
sense, probes are small-scale mutations that can cause only small losses. The results
give insights that reveal new options to compete in the future and stimulate creative
thinking. The research investment pays dividends when a new way of competing is
discovered that alters the rules of the status quo. When the changes in the environment
occur too fast, sensing the variables becomes more difficult. A specialized organ may not
be able to react in time to record the metric and transmit the alert. In this case, the
system starts to lose information, threatening its own viability. When the changes in the
environment are too drastic, even if the sensor organs detect the change, the inference
organs may not be able to determine an effective course of action because they do not
have previous experience, or because the decision-making process requires more time.
This situation also threatens the viability of the system in the long run. The effects of
drastic variations and high rates of change on systems can be visualized with simple
experiments: a) increasing the speed of transmission in a communication channel beyond
some limit will cause the loss of part or all of the message, and b) modifying the pH of
the soil beyond a certain limit can cause the death of a plant. The same syndrome can be
recognized in any type of organization. It is possible to employ a new strategy.
"Competing on the Edge" is a new theory that defines strategy as the creation of a
relentless flow of competitive advantages that, taken together, form a semi-coherent
strategic direction (Brown & Eisenhardt, 1998). The key driver for superior performance is
the ability to change, reinventing the organization constantly over time. This success
factor can be applied to software engineering as well as to other decision problems with
similar characteristics.

If the environment is moving, as in surfing, the best way to remain in
equilibrium is to stay in rhythm. Successful corporations such as Intel or Microsoft
are in perpetual movement, launching new products with a certain rhythm. Intel is faithful
to its founder's (Moore's) law: the power of microprocessors doubles every eighteen
months. Microsoft keeps a proportional pace in the software sector. The challenges
imposed by hyper-competition create characteristics similar to those of software
engineering developments. So, the rules of engagement proved effective for one discipline
could prove useful in the other.


5.  Application  in  Software  Engineering 

Chaos  in  software  development  comes  from  various  sources:  a)  the  intrinsic 
variable  nature  of  requirements,  b)  the  changes  introduced  by  new  technologies,  c)  the 
dynamics  of  the  software  process,  and  d)  the  complex  nature  of  human  interaction.  These 
conditions  are  sufficient  for  the  development  of  complex  adaptive  systems  where  the 
agents  are  software  developers  or  parallel  collaborative  projects.  Software  development 
scenarios  usually  have  high  equivocality,  high  environmental  complexity  and  high 
uncertainty.  The  suggested  organizational  structure  to  deal  with  such  scenarios  (Burton 
and  Obel,  1998)  should  have  low  formalization  and  organizational  complexity, 
centralization  inverse  to  the  environmental  complexity,  and  rich  and  abundant 
information  exchange.  The  recommended  organization  should  be  ad  hoc  or  matrix,  with 
coordination  by  integrator  or  group  meeting.  This  organizational  style  is  difficult  to 
achieve  when  the  organizations  are  large. 

A simple solution can be recognized at Microsoft (Cusumano, 1997): a) parallel
developments by small teams with continuous synchronization and periodic
stabilization, b) software evolution processes where the product acquires new features in
increments as the project proceeds rather than at the end of a project, c) testing conducted
in parallel as part of the evolution process, and d) focusing creativity by evolving
features and "fixing" resources. Cusumano observed that small development teams were more
productive because: a) fewer people on a team have better communication and
consistency of ideas than large teams, and b) in research, engineering and intellectual
work, individual productivity has a large variance. Software development requires
teamwork, more specifically organized work. So it is necessary to understand the dynamics
of organizations as artificial social entities that exist to achieve a specific purpose,
in this case to develop software. Such organizations are made up of individuals who
accomplish diverse, disaggregated activities that require coordination and consequently
information exchange.


In order to apply this approach, three issues should be resolved. First, automated
risk assessment is required (the topic of this research). Second, evolutionary software
processes should establish the maximum speed of the evolution. If the evolutions occur
too fast, without a period of relaxation, it is certain that the process will fall into
chaos. On the other hand, if the speed is too slow, productivity could suffer. The
correct rhythm for software processes has not been researched and remains in the hands
of the project manager. Third, software processes should be focused on flexibility and
extensibility rather than on high quality. This assertion sounds scary. However, it is
necessary to prioritize the speed of the development over zero defects. Extending the
development in order to reach high quality could result in a late delivery of the product,
when the opportunity niche has disappeared. This paradigm shift is imposed by the
competition on the edge of chaos.

A shift from the traditional long-term development organizations is required.
Virtual teams created as temporary, dynamic, project-oriented structures, with a
composition of skills matching the objectives exactly, could improve current
performance. Such virtual organizations are not exposed to bureaucratic loads and do not
need to absorb the cost of permanent staff (Senegupta and Jones, 1999). Larger
developments could be achieved by loosely coupled parallel projects sharing a common
architecture such as CORBA or DCOM. This paradigm makes it possible to manage
large development organizations as if they were small. In such scenarios, the benefits of
complex adaptive systems will occur at two levels. At the micro level, inside each small
project, the agents are individuals. At the macro level, the agents are the small
projects.


6.  Conclusion 

Complex adaptive systems appear to be the most attractive way to deal with
changing environments. Apart from some indicators introduced by (Brown and Eisenhardt,
1999), the academic research is not mature enough to assert a methodology for
competition on the edge. Some enterprises like Microsoft and Intel seem to have
discovered and applied this form of strategy for many years, but little information
has filtered out. The drastic change proposed in the software processes aims to bring the
benefits of programming in the small to programming in the large. The quality-driven
paradigm should be revised, and the objective should be shorter delivery times,
flexibility, and scalability.

III.  CONCEPTUAL  FRAMEWORK


This chapter contains the framework for risk identification and risk assessment.
Causal analysis was used to find the primitive origins of threats in a project, in search
of a way to identify and assess risk automatically. From the point of view of software
engineering, it is necessary to create a methodology to frame the decision-making
process during the early stages of the life cycle, when changes can be made with less
impact on the budget and schedule. According to Field (1997), the most significant
causes of IS project failures are: lack of understanding of users' needs, ill-defined
scopes, poor management of project changes, changes in the chosen technology, changes in
the business needs, unrealistic deadlines, user resistance, loss of sponsorship, lack of
personnel skills, and poor management.

Risk management can be divided into three activities: risk identification, risk
assessment and risk resolution. Risk identification is the set of techniques designed to
flag and identify possible threats. Risk assessment is the quantitative analysis of the
probabilities and impacts of the identified threats. Risk resolution is the application of
resources and effort to avoid, transfer, prevent, mitigate or assume the risks. This third
activity is beyond the scope of this research.

In order to achieve risk management, an organization requires a minimum level of
maturity that can be associated with Capability Maturity Model (CMM) level 2. SEI
followers said that "many organizations are unable to manage risks effectively for any of
the three following reasons: a risk-averse culture; an inadequate management
infrastructure to support effective risk management; or the lack of a systematic and
repeatable method to identify, analyze, and plan risk mitigation" (Carr, 1997). If an
organization is not able to collect metrics, any attempt to formally identify and assess
risks is impossible. Project managers require critical information to make timely and
prudent decisions. It is not surprising that increased complexity can decrease a project
manager's ability to identify and manage risk.

In the view of this research, software risks could be controlled if the problems of how
to administer uncertainty, complexity and resources are solved. Structuring the
unstructured problem of risk assessment leads to a formal method that can be translated
into an algorithm. In order to structure the problem, project risk was analyzed and
decomposed into simpler parts. Using causal analysis, three major risk factors were
identified: process risk, resource risk and product risk. Each of these factors introduces
risk by itself, but mainly through its interaction with the others.

Resource risk is affected by organizational, operational, managerial and
contractual parameters such as resources, outsourcing, personnel, time and budget, among
others. The literature is abundant in this area (Hall, 1997), (Karolak, 1996), (Grey, 1995).
Various approaches use subjective techniques such as guidelines and checklists (SEI,
1996), (Hall, 1997), (Karolak, 1995), which, even when supported by metrics,
require expert opinion.

Engineering development work procedures such as software development,
planning, quality assurance, and configuration management cause process risk. The more
complex a process is, the more difficult it is to manage, and the more education, training,
standards, reviews, and communication are required. Consequently, complexity grows.
Software process complexity has been partially covered by research in terms of
subjective assessments of maturity level and expertise (SEI, 1996), (Hall, 1998),
(Humphrey, 1989). However, a more precise and objective method is required. Several
approaches to studying process complexity have been introduced in the field of management.
In particular, (Nissen, 1998) introduced an objective methodology for measuring process
complexity that can be applied to software development.
Cyclomatic complexity of the process graph is another candidate metric. These two
approaches measure complexity in a static way. Simulation can be used to measure the
complexity of the dynamics of the processes (Abdel-Hamid, 1989 and 1991).
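
As an illustration of the second static metric, McCabe's cyclomatic complexity of a
graph is V(G) = E - N + 2P, where E is the number of edges, N the number of nodes and P
the number of connected components. The sketch below assumes the process graph is given
as a dictionary mapping each activity to its successor activities; that representation
and the activity names in the example are hypothetical.

```python
def cyclomatic_complexity(graph):
    """Compute V(G) = E - N + 2P for a directed graph given as {node: [successors]}."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    edges = sum(len(targets) for targets in graph.values())

    # Connected components are counted on the undirected version of the graph.
    adjacent = {node: set() for node in nodes}
    for source, targets in graph.items():
        for target in targets:
            adjacent[source].add(target)
            adjacent[target].add(source)

    seen = set()
    components = 0
    for node in nodes:
        if node in seen:
            continue
        components += 1
        stack = [node]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(adjacent[current] - seen)

    return edges - len(nodes) + 2 * components


# Hypothetical process: a review step that can send work back to design.
process = {
    "specify": ["design"],
    "design": ["code"],
    "code": ["review"],
    "review": ["design", "release"],
    "release": [],
}
complexity = cyclomatic_complexity(process)
```

A strictly sequential process yields a value of 1. Each additional decision point or
feedback loop in the process graph raises the value by one.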

Finally, product risk is related to the final characteristics of the product, its
complexity, its conformance with specifications and requirements, its reliability, and
customer satisfaction. The product introduces its own risk in terms of quantitative and
qualitative attributes. Two basic product-risk factors, requirement stability and
requirement complexity, were identified. Requirement stability is measurable using the
set of metrics previously described. Due to the inherent lack of structure of requirements,
it is necessary to transform them into specifications in order to compute complexity.
Other product characteristics such as reliability and maintainability are not useful for
identifying and assessing risk in the early stages. Reliability can be measured only at or
near completion. Maintainability can be measured only once the design has been stated.
Both measures are useful to control the project in future phases. For instance, applying
Schneidewind's model it is possible to monitor the occurrence of software errors as a
predictor for future cumulative detected and corrected errors. These estimations are useful
in order to: (1) identify the trade-off function between error reduction and cost of error
reduction, (2) provide a quantitative basis for accepting or rejecting software during
functional testing, and (3) provide a quantitative basis for deciding whether additional
testing is warranted based on the cost of error removal. Maintainability can be measured
using metrics such as those introduced in (Lorenz, 1995).

The analysis showed a dependency between these classes of risk. The success of
the project depends on its own characteristics and on the success of the product and the
process. The success of the process depends on itself as well as on the successful use of
resources in the project and on the success of the product. And the success of the product
depends on itself, but also on the success of the resources and the process. The three
areas constitute an equivalence relation (Fig. 3.1) because the symmetric, transitive and
reflexive properties apply. Moreover, the three areas form a single equivalence class in
the relation. The strong dependency between the three concepts reflects the fact that
resources, process and product are different facets of the same entity: the project.
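
The equivalence claim can be checked mechanically. The sketch below models "depends
on" as a set of ordered pairs over the three facets and tests the three properties. The
names `is_equivalence`, `equivalence_classes`, `FACETS` and `DEPENDS_ON` are
illustrative assumptions, and the relation is taken to hold between every pair of facets,
as the paragraph above describes.

```python
from itertools import product


def is_equivalence(elements, relation):
    """True if `relation` (a set of ordered pairs) is reflexive, symmetric, transitive."""
    reflexive = all((a, a) in relation for a in elements)
    symmetric = all((b, a) in relation for (a, b) in relation)
    transitive = all((a, d) in relation
                     for (a, b) in relation
                     for (c, d) in relation if b == c)
    return reflexive and symmetric and transitive


def equivalence_classes(elements, relation):
    """Group the elements into classes of mutually related members."""
    classes = []
    for a in elements:
        cls = frozenset(b for b in elements if (a, b) in relation)
        if cls not in classes:
            classes.append(cls)
    return classes


FACETS = ("resources", "process", "product")
# Every facet depends on itself and on each of the other two.
DEPENDS_ON = set(product(FACETS, repeat=2))

holds = is_equivalence(FACETS, DEPENDS_ON)
classes = equivalence_classes(FACETS, DEPENDS_ON)
```

With the full relation, `classes` contains a single class holding all three facets,
which matches the statement that they are facets of one entity.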

The process provides the description of its environment and the theoretical requirements
to execute it. Consequently, the process introduces threats due to its requirements and
characteristics: complexity, technology required, budget required, schedule required, and
personnel skills required. The resources represent the actual allowances in personnel,
tools, budget and schedule. They impose constraints that may not match the process
requirements. Productivity is a consequence of how well these two facets of the
project match.

Figure 3.1: The Equivalence Relation

The  decomposition  created  by  causal  analysis  (Fig.  3.2)  reveals: 

•  A  method  to  identify  risks  by  comparing  the  degree  of  mismatching  between 
the  product  and  process  characteristics,  against  the  resource  constraints. 

• Candidate indicators to be used in the estimation model. In Chapter V, three
groups of metrics will be introduced: a) for requirements, b) for personnel
(the key resource), and c) for complexity. These three groups of metrics
correspond to the three risk factors identified by causal analysis.


Figure 3.2: Causal Analysis Fishbone Diagram

IV. RESEARCH DESIGN


As outlined in the introduction, this research is focused primarily upon risk
assessment for software engineering. More precisely, it addresses the issue of human
dependency in the risk assessment of evolutionary software processes by incorporating an
automated risk assessment method. Despite the improvements achieved in software
processes, software reuse and automated tools, risk assessment for software projects
remains an unstructured problem dependent on human expertise. It is the intention of
this research to find ways to transform risk assessment into a structured problem. Solving
the risk assessment problem with indicators measured in the early phases constitutes the
main contribution of this research.

The problem of productivity is partially solved. The industry has enough tools
that improve software development productivity. New efforts in this vein are not the
solution for the software crisis because the problem, in the author's opinion, lies in
organizational and human communication issues. Software development is still a
human-dependent activity requiring a lot of human communication, and without appropriate
managerial decision support tools, software engineering will remain in its present state. A
better understanding of the internal phenomenology of the software life cycle is required
to improve software development, because the bottleneck is now located in the human
aspects of the software process. Without such knowledge, risk assessment is almost
impossible.

The primary research question is: What are the early, automatically collectable
measures from the software process that describe project risk? The risk of the project is
related to its probability of success, that is, the probability of reaching the objective with
the assigned resources in the allocated time. The main point in the question is the
discovery of a set of good indicators for risk. These indicators should be recognizable
during the early phases of the process in order to provide early alerts. To answer the
research question, a literature review was conducted covering different fields:

1. Review of the papers and books about software evolution. This study helped
to  understand  the  scope  and  limitations  of  such  software  processes,  and  it 
helped  to  discover  the  problem. 

2.  Review  of  risk  management  from  the  operational  research  point  of  view. 
This  study  provided  theoretical  background  to  produce  a  mathematical 
model. 

3. Review of the literature about risk management in the field of software
engineering. This study showed two well-defined groups of researchers.
The first group follows a less rigorous, human-dependent approach
starting from the beginning of the project. The second group corresponds
to the software reliability field and follows a rigorous post mortem
approach. This study revealed that this research was original. The second
body of research also provided insights into how to link the operational
research approaches with the software engineering approaches.

4.  Causal  analysis  was  employed  to  find  a  set  of  candidate  indicators  for  risk. 
The  set  of  candidate  indicators  was  compared  with  previous  research.  It 
was  found  that  requirement  variability,  personnel  turnover  and  complexity 
were  promising  indicators. 

5. Review of the software economics research, especially COCOMO and
Putnam's  models.  This  study  showed  that  the  estimation  models  available 
today  have  some  limitations  when  applied  to  evolutionary  software 
processes. 

6. Experiments to prove the correlation between complexity and size were
conducted using the available baselines of projects created by the
evolutionary software process, specifically using CAPS.

The second research question is: How can these measures be related in order to
assess project risk? Answering this question implies the formalization of a model and its
calibration and validation in three ways: a) internal consistency proved by mathematics
and statistics; b) black box validation by comparing its outputs in duration and effort with
other available models; and c) black box validation against a set of observations. To
achieve this last goal a large set of well-measured software projects is required. Such a set
could not be found. A second and more promising alternative was to simulate a set of
projects. ViteProject was chosen for the following reasons:

•  Availability 

•  Possibility  of  customizing 

•  Includes  the  model  for  communications  and  exceptions 

• Given that the proposed model uses parameters collected during the early
phases and given that ViteProject requires a complete breakdown structure of
the project (which can be done only in the late phases), there exists a
considerable time gap between the two measurements. Such a time gap is
shorter than the one obtained by conducting a post mortem analysis, but it is
enough for calibration and validation purposes.

However, the simulation tool is not configured for software projects. To solve this
problem it was necessary to review organizational theory and use an expert system
(Organizational Consultant) to obtain the correct parameters (see Appendix B). The
research ends by proposing an extension to the latest version of the graph model, namely
the relational hypergraph model, to support automated risk assessment.

V. DEVELOPMENT OF THE MODEL


This chapter applies the framework described in Chapter III to develop a model
for risk identification and risk assessment. First, some concepts about software metrics
are discussed, presenting a small set of metrics on which the model is based.
Next, the model is discussed with its variables and their relationships.

A.  SOFTWARE  METRICS 

Metrics  are  a  key  factor  in  the  identification  of  threats.  Without  metrics  it  is  not 
possible  to  provide  early  alerts  of  risks.  There  are  some  erroneous  perceptions  about 
metrics  that  it  is  necessary  to  clarify: 

•  "Metrics  act  against  the  creative  process."  This  is  an  excuse  to  avoid  the  use  of 
metrics.  Metrics  should  be  collected  without  the  direct  intervention  of 
humans.  The  collection  process  should  be  transparent  to  designers. 

•  "Metrics  represent  additional  work  load."  The  collection  procedure  can  be 
automated,  so  the  extra  workload  is  not  significant.  The  analysis  of  the  metrics 
requires  the  attention  of  the  project  manager,  and  this  is  his  normal  work. 

•  "The  benefits  of  metrics  are  unclear."  This  myth  is  really  irrational.  Without 
measures  over  the  process  it  is  impossible  to  assess  how  much  effort  is 
required,  or  what  are  the  risks  that  should  be  mitigated. 

•  "People  are  afraid  of  metrics."  That  is  true,  and  it  is  very  common  to  find 
some resistance to the introduction of a metrics plan. It is important to use the
metrics to measure the process rather than use them to punish low
productivity.

This section describes a set of metrics that support the risk identification strategy.
All the metrics presented here are well formed, in the sense that they present the
following strengths:

•  Robustness.  Capacity  of  being  tolerant  to  variability  of  the  inputs. 

• Repeatable. Different observers would arrive at the same measurement no
matter how many repetitions.

•  Simple.  Using  the  least  number  of  parameters  sufficient  to  obtain  an  accurate 
measurement. 

•  Easy  to  calculate.  They  do  not  require  complex  algorithms  or  processes. 

•  Automatically  collected.  There  is  no  need  of  human  intervention. 

The minimal set of metrics to support the risk assessment model covers three areas:
a) requirements, b) personnel (the key resource), and c) complexity. These three groups
of metrics correspond to the three risk factors identified by causal analysis, described
in Section 3.1.

1.  Metrics  for  Requirements 

a.  Birth-rate  (BR) 

Birth-rate  is  defined  as  the  percentage  of  new  requirements  incorporated  in 
each  cycle  of  the  evolution  process.  This  metric  shows  the  explosion  of  new  requirements 
as  a  percentage. 

BR  =  (NR/TR)*  100  (%) 

where  NR  =  number  of  new  requirements 

TR  =  total  number  of  requirements  =  PR  +  NR 
PR  =  previous  requirements 

b.  Death-rate  (DR) 

Death-rate is defined as the percentage of requirements that are dropped by
the customer in each cycle of the evolution process.

DR = (DelR/TR) * 100 (%)

where  DelR  =  number  of  requirements  deleted 

TR  =  total  number  of  requirements  (before  deletion)  =  PR  +  NR 

c.  Change-rate  (CR) 

Change-rate  is  defined  as  the  percentage  of  requirements  changed  from  the 
previous  version. 

CR = (ModR/TR) * 100 (%)

where  ModR  =  number  of  requirements  changed 
TR  =  total  number  of  requirements 

From  the  point  of  view  of  the  metrics,  a  change  on  a  requirement  can  be 
viewed  as  a  death  of  the  old  version  and  a  birth  of  the  new  one.  This  simplification  does 
not  imply  losses  of  information  about  the  history  of  the  evolution.  The  traceability  of  the 
evolution  remains  in  the  hypergraph  model. 
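
The three requirement metrics can be illustrated with a short Python sketch. The class and function names are hypothetical and chosen only for illustration; the sketch assumes the counts for one evolution cycle are already available (for example, extracted from the hypergraph), and it uses TR = PR + NR for all three rates.

```python
from dataclasses import dataclass


@dataclass
class CycleCounts:
    """Requirement counts for one evolution cycle (hypothetical container)."""
    previous: int  # PR: requirements carried over from the previous cycle
    new: int       # NR: requirements added in this cycle
    deleted: int   # DelR: requirements dropped by the customer
    modified: int  # ModR: requirements changed from the previous version


def requirement_rates(counts: CycleCounts) -> dict:
    """Return BR, DR and CR as percentages of TR = PR + NR."""
    total = counts.previous + counts.new
    if total <= 0:
        raise ValueError("TR = PR + NR must be positive")
    return {
        "BR": 100.0 * counts.new / total,
        "DR": 100.0 * counts.deleted / total,
        "CR": 100.0 * counts.modified / total,
    }


# Example: 80 previous requirements, 20 new, 5 deleted, 10 modified.
rates = requirement_rates(CycleCounts(previous=80, new=20, deleted=5, modified=10))
print(rates)
```

With these example counts TR is 100, so each rate equals the corresponding count.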

The simplification just described enables one to compare birth-rate and death-rate in
a two-dimensional plot that shows four regions: stability region, growing region,
volatility region and shrinking region (Fig. 5.1). The graph is double logarithmic, so
the borders of the four regions lie at the 10% value. Each of these regions has
different risk connotations. The arrow shows the normal evolution of the project as
time goes by. During early stages, it is normal for projects to be in the growing region.
However, if the project continues in this region after many cycles, or returns to this
region after visiting other regions, something is wrong. The first case is an indicator
that the requirement engineering is not efficient; hence some corrective action should be
applied. The second case shows evidence of the late discovery of some cluster of hidden
requirements.

Figure 5.1: Evolution of Requirements

After some cycles, the project should be in the volatile region. If the
project does not evolve through the stability region, then there is evidence that the
requirements engineering activity is not efficient and some corrective action is
mandatory. It is important to analyze the evolution of the stakeholders' issues and
criticisms. It could also be the case that stakeholders have changed their minds. If the
project evolves to the shrinking region, and the requirements engineering is working
correctly, there is evidence that the customers are cutting down the project. This can be
the indicator of a severe cut in the budget. Finally, any involution to a previous region
should be considered as evidence of threats. In such cases a detailed analysis is required
to assess the causes of the anomaly. This set of metrics can be collected automatically
from the hypergraph and can give early alerts of the threats.
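
A minimal Python sketch of how the four regions could be detected automatically is given below. It rests on assumptions that the text does not state explicitly: that the growing region means a high birth-rate with a low death-rate, the volatility region means both rates are high, the shrinking region means a low birth-rate with a high death-rate, and the stability region means both are low, and that a value exactly on the 10% border counts as high. The function names are hypothetical.

```python
def requirement_region(birth_rate: float, death_rate: float,
                       border: float = 10.0) -> str:
    """Classify one cycle by its birth-rate and death-rate (both in %).

    Assumed mapping (not stated in the text): the 10% border splits each
    axis into low and high, and a value on the border counts as high.
    """
    high_birth = birth_rate >= border
    high_death = death_rate >= border
    if high_birth and high_death:
        return "volatility"
    if high_birth:
        return "growing"
    if high_death:
        return "shrinking"
    return "stability"


def returns_to_growing(regions: list) -> bool:
    """True if the project re-enters the growing region after leaving it,
    which the text treats as evidence of late-discovered requirements."""
    seen_growing = False
    left_growing = False
    for region in regions:
        if region == "growing":
            if left_growing:
                return True
            seen_growing = True
        elif seen_growing:
            left_growing = True
    return False


# Example: (BR, DR) pairs for four consecutive cycles.
history = [requirement_region(br, dr)
           for br, dr in [(40, 2), (25, 15), (5, 3), (30, 4)]]
print(history, returns_to_growing(history))
```

A check such as `returns_to_growing` is only one of the alerts described above; staying in the growing region for many cycles would need a separate threshold on the number of cycles, which the text does not specify.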


2.  Metrics  for  Personnel 

It is necessary to measure the fit between people and their roles in the software
process. In order to measure personnel, both quantitative and qualitative metrics are
required. A skill match between person and job is required to estimate the speed in
processing information and the rate of exceptions. On the quantitative side it is important
to measure the number of people and the turnover. The latter provides information about
the expected productivity losses due to training, learning curves and communications.
This set of metrics is difficult to collect because people are very reluctant to be measured.
The simulations showed that there exists an easier way to measure the productivity fitness:
observing the ratio between direct working time and idle time. Fitness is related to two
risk factors: the resources and the process.


3.  Metrics  for  Complexity 

In general, the complexity of an object is a function of the relationships among the
components of the object. In an early vision of the modern object-oriented paradigm,
Myers (1976) introduced three valuable concepts to measure complexity:

• Independence: The independence of each component can reduce the
complexity of the system if the components are a partition of the system. So,
there is maximum cohesion and minimum coupling.

•  Hierarchy:  Hierarchical  structures  allow  the  stratification  of  the  system  in 
different  layers  of  abstraction. 

•  Explicit  communication:  The  components  should  communicate  with  explicit 
protocols  avoiding  any  hidden  side  effects. 

Complexity has a direct impact on quality because the likelihood that a
component fails is directly related to its complexity. The quality of the product can only
be determined at the end of the process. Hence, it is important to measure complexity
as a predictor (Munson, 1995). Real-time systems present special difficulties in terms of
requirement engineering. Some requirements are difficult for the user to provide and
difficult for the analysts to determine. The best way to discover these hidden requirements
is via prototyping. CAPS is a CASE tool especially suited for this task. It has an
easy-to-understand graphical interface that is mapped to a specification language, which in
turn generates Ada code. The main components of CAPS are:

•  The  prototype  system  description  language  (PSDL).  PSDL  is  based  on  data 
flow  under  real-time  constraints.  It  uses  an  enhanced  data  flow  diagram  that 
includes  non-procedural  control  and  timing  constraints. 

•  User  interface  based  on  a  graphic  editor  with  a  palette  of  objects  that  include 
operators,  inputs,  outputs,  data  flows  and  operator  loops.  A  browser  helps  the 
designer  to  find  reusable  components.  An  expert  system  provides  the 
capability  to  generate  English  descriptions  of  PSDL  specifications. 

•  The  software  database  system  provides  the  repository  for  reusable  PSDL 
components. 

• The execution support system consists of a translator, scheduling mechanisms
and a debugger.

The prototyping process consists of prototype construction and modification
(evolution) based on evolving requirements and code generation. Both construction and
modification are exploratory activities with a common target: to satisfy multiple users
with different and often conflicting points of view. Requirement engineering is a
consensus-driven activity in which mechanisms for conflict resolution and traceability of
requirement evolution represent critical success factors.

The specifications written in PSDL can be analyzed to compute their complexity.
PSDL code has the following tokens: types, operators, data streams
and constraints. Types are declarations of abstract data types required for the system.
Operators and data streams are the components of a dataflow graph. Finally, constraints
represent the real-time constraints that the system must support.

Two complexity metrics were defined for PSDL: the Fine Granularity Complexity
metric (FGC) and the Large Granularity Complexity metric (LGC). Different metrics are
computed because they are indicators of two classes of threats. First, it
is necessary to be aware of operators that are too complex. High complexity in one
operator could be caused by poor design and can possibly be solved by further
decomposition. Second, it is necessary to have a metric to compute the total complexity of
the system.
FGC  expresses  the  complexity  of  each  operator  in  the  system  and  is  a  function  of 
the  fan-in  and  fan-out  data  streams  related  to  the  operator. 

FGC  =  fan-in  +  fan-out 

LGC expresses the complexity of the system as a function of the number of
operators, data streams, and types.

LGC = O + D + T

where O = number of operators
D = number of data streams
T = number of types

To analyze PSDL code it was necessary to develop a tool to compute the LGC
and FGC. In Figure 5.2 LGC is presented under the title of "Complexity" and FGC is
presented under the title "Fan-In+Fan-Out".
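
The two definitions can be sketched in Python as follows. This is not the tool mentioned above and it does not parse PSDL; it assumes, purely for illustration, that a specification has already been reduced to a list of operator names, a list of data streams given as (source, target) pairs, and a list of type names. All identifiers are hypothetical.

```python
def fine_granularity_complexity(operators, streams):
    """Return FGC = fan-in + fan-out for every operator.

    `streams` is a list of (source, target) pairs. Endpoints that are not
    listed operators (for example external inputs) are ignored.
    """
    fgc = {name: 0 for name in operators}
    for source, target in streams:
        if source in fgc:
            fgc[source] += 1  # the stream leaves `source`: fan-out
        if target in fgc:
            fgc[target] += 1  # the stream enters `target`: fan-in
    return fgc


def large_granularity_complexity(operators, streams, types):
    """Return LGC = O + D + T for the whole specification."""
    return len(operators) + len(streams) + len(types)


# Example: a three-operator dataflow graph with one abstract data type.
operators = ["sensor", "filter", "display"]
streams = [("sensor", "filter"), ("filter", "display"), ("sensor", "display")]
types = ["reading"]

print(fine_granularity_complexity(operators, streams))
print(large_granularity_complexity(operators, streams, types))
```

In this example every operator touches two streams, and the system total is 3 operators + 3 data streams + 1 type.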


Figure  5.3:  Correlation  between  PSDL  and  LGC 


Figure  5.4:  Correlation  between  Ada  code  and  LGC 


The comparison between the Ada non-comment lines of code of the projects and
their complexity measured using LGC also shows a strong correlation (R = 0.898). The
complexity metric correlates better with PSDL than with Ada. The reason for this
difference is that CAPS automatically generates PSDL. On the other hand, even if
CAPS generates part of the Ada code, the designer can add to and modify the generated
code, introducing more variability. Figure 5.4 shows the correlation observed for the same
set of projects.

A caveat of this study is that the sample is small, but it includes all the information
available at the current time. Nevertheless, the study suggests the possibility of estimating
size in terms of complexity with a useful degree of accuracy.


B.  ESTIMATION  METHODS 

Software projects could be considered as experiments where their cost and
schedule are the output measures. It is well known that software projects tend to overrun
costs and schedule; this fact has been shown by research and industry (Boehm, 1981),
(Putnam, 1997), (Jones, 1996). There are two possible ways to interpret the result of the
experiment. One hypothesis is that this behavior is abnormal, and a consequence of a lack
of process maturity (SEI/CMM approach). Another hypothesis is that this could be a
"false-abnormal" behavior, assumed abnormal as a consequence of inappropriate
measurements.

The  industry  has  been  using  three  classes  of  tools  to  estimate  effort  and  time  that 
can  be  applied  at  different  moments  during  the  life  cycle,  each  category  being  more 
precise than the previous one but arriving later:

•  Very  early  estimations.  This  category  includes  very  crude  approximations 
done  during  the  beginning  of  the  process  usually  by  subjective  comparisons 
using  previous  projects. 

•  Macro  models.  This  category  includes  Basic  COCOMO,  Putnam,  Function 
Points,  etc.  The  estimation  is  done  after  completing  the  requirements  phase. 

•  Micro  models.  This  category  includes  intermediate  and  detailed  COCOMO, 
and  Pert/CPM/Gantt  techniques.  The  estimation  is  done  after  the  design  when 
it  is  possible  to  have  a  work  breakdown  structure.  The  estimation  is  the 
integration  of  all  module  estimations. 

None of these techniques considers the following characteristics of software
projects: a) requirements stability, b) personnel stability, and c) time consumed by
communications, exceptions and noise in the process. All the methods use size as an input
parameter, as some kind of derivation from complexity. In many cases the methods to
compute such complexities and sizes are questionable. Recently, Stanford University
(Levitt, 1999) developed a new generation micro model estimation tool (ViteProject) that
addresses some of the previous concerns. However, this tool is useful to control the
project, but its results arrive too late for early estimation.

How  to  create  a  macro  model  that  considers  the  previous  concerns  and  is  able  to 
be  used  during  the  early  stages  of  the  process?  Probabilities  can  be  applied.  In  1939  the 
Swedish  physicist  Waloddi  Weibull  introduced  a  heavy-tailed  probability  distribution  to 
represent  the  distribution  of  the  breaking  strength  of  materials  (Devore,  1995).  There  is 
some  controversy  about  who  was  the  first  scientist  that  introduced  this  distribution.  There 
is  a  previous  study  of  1933  describing  the  "laws  governing  the  fineness  of  powdered 
coal"  that  used  a  similar  function  (Johnson94).  Weibull  distribution  is  also  known  as 
Weibull-Gnedenko  in  the  Russian  literature,  and  as  Frechet  for  an  earlier  paper  presented 
in  Poland  in  1927. 

Weibull used this distribution to model the strength of Bofors steel, the fiber strength
of Indian cotton, the length of Cyrtoideae, the fatigue life of steel, the statures of adult
males, and the breadth of beans. The Weibull distribution includes the exponential and
the Rayleigh as special cases. It has been used to model different failure rates: a)
decreasing (when the shape parameter α < 1), b) constant (when α = 1, the exponential
case with λ = 1/β), and c) increasing (when α > 1). Many authors (Johnson94, Devore95,
Lyu95) advocated the use of this distribution in reliability and quality control. Others,
like Putnam and Norden, used it to model software life cycles. These previous works,
cited in Chapter II, motivated the interest in this distribution.

In some literature (Devore, 1995) and software (Excel), the distribution function
is presented with two parameters: α (the shape parameter), and β (the scale parameter that
can compress or elongate the curve in the x axis). However, Weibull in his original work
mentioned a third parameter, γ, to shift the curve to the right.

A random variable x is said to have a Weibull distribution with parameters α and
β (with α > 0, β > 0) if the probability density function (pdf) and cumulative
distribution function (cdf) of x are respectively:

\[
\text{pdf:}\quad f(x; \alpha, \beta) =
\begin{cases}
0, & x < 0 \\
\dfrac{\alpha}{\beta^{\alpha}}\, x^{\alpha-1} \exp\!\left(-(x/\beta)^{\alpha}\right), & x \ge 0
\end{cases}
\]

\[
\text{cdf:}\quad F(x; \alpha, \beta) =
\begin{cases}
0, & x < 0 \\
1 - \exp\!\left(-(x/\beta)^{\alpha}\right), & x \ge 0
\end{cases}
\]

Let us discuss the meaning of each of the variables in the function:

a) x is the random variable under study. In this case, x can be interpreted as
development time.

b) α is a shape parameter. Increasing it reduces the variability, narrowing the shape
of the pdf.

c) β is a scale parameter that stretches or compresses the graph in the x direction.

d) Note that the functions start at x = 0. A third parameter is required to shift the
curves to the right. For that reason a location parameter γ was introduced,
which is a function of the system complexity. The new functions are then:

\[
\text{pdf:}\quad f(x; \gamma, \alpha, \beta) =
\begin{cases}
0, & x < \gamma \\
\dfrac{\alpha}{\beta^{\alpha}}\, (x-\gamma)^{\alpha-1} \exp\!\left(-\left(\dfrac{x-\gamma}{\beta}\right)^{\alpha}\right), & x \ge \gamma
\end{cases}
\qquad \text{(Eq. 1)}
\]

\[
\text{cdf:}\quad F(x; \gamma, \alpha, \beta) =
\begin{cases}
0, & x < \gamma \\
1 - \exp\!\left(-\left(\dfrac{x-\gamma}{\beta}\right)^{\alpha}\right), & x \ge \gamma
\end{cases}
\qquad \text{(Eq. 2)}
\]
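
The three-parameter pdf and cdf can be evaluated with a few lines of Python. The sketch below is only an illustration of the two equations; the function names are hypothetical, and the parameter values in the example are arbitrary rather than calibrated values from this research. If x is read as development time in months, the cdf gives the probability that the development time does not exceed x.

```python
import math


def weibull_pdf(x: float, alpha: float, beta: float, gamma: float = 0.0) -> float:
    """Three-parameter Weibull density: shape alpha, scale beta, location gamma."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if x < gamma:
        return 0.0
    z = (x - gamma) / beta
    if z == 0.0 and alpha < 1.0:
        return math.inf  # the density is unbounded at x = gamma when alpha < 1
    # (alpha / beta^alpha) * (x - gamma)^(alpha - 1) equals (alpha / beta) * z^(alpha - 1)
    return (alpha / beta) * z ** (alpha - 1.0) * math.exp(-(z ** alpha))


def weibull_cdf(x: float, alpha: float, beta: float, gamma: float = 0.0) -> float:
    """Three-parameter Weibull cumulative distribution function."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if x < gamma:
        return 0.0
    z = (x - gamma) / beta
    return 1.0 - math.exp(-(z ** alpha))


# Example with arbitrary parameters: gamma = 10 months, beta = 6, alpha = 2.
for months in (8, 12, 16, 24):
    print(months, round(weibull_cdf(months, alpha=2.0, beta=6.0, gamma=10.0), 4))
```

One property that is useful when reading the parameters: at x = γ + β the cdf equals 1 - 1/e (about 0.632) for any value of α, so β sets how far beyond γ the bulk of the probability lies.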
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C.  CONSTRUCTION  OF  THE  MODEL  AND  SIMULATIONS 

1.  Finding  the  Complexity  Metric  and  its  Conversion  to  KLOC 

One  of  the  goals  of  this  research  was  to  provide  a  way  to  assess  the  duration  of  the 
project  given  some  indicators  collected  during  the  requirements  phase.  In  such 
conditions,  code  is  not  available,  so  the  only  possible  measurements  should  come  from 
the  specification. 

Research on Function Points (FP) (Albrecht 1979, 1983) showed that there exists
a clear relation between complexity and size in terms of lines of code. However, FP is not
well suited for real-time systems or object-oriented developments. The reason is that the
parameters used in FP are not representative of the complexity in such systems. Chapter
II discussed this issue in detail. Consequently, it was necessary to look for another way to
measure complexity. The observed properties of PSDL showed characteristics that could
be used to calculate complexity. In order to measure the complexity of a
module, the count of the fan-in and fan-out is a good estimator. This metric was called
Fine Granularity Complexity (FGC). In order to find the complexity of the whole system,
the count of PSDL operators (bubbles), data streams (arrows), and types is a good
estimator. This metric was called Large Granularity Complexity (LGC).




The  observations  showed  a  strong  linear  correlation  between  LGC  and  size  of  the 
specification.  More  interesting  was  the  finding  of  a  strong  (but  lower)  correlation 
between  LGC  and  the  size  of  the  projects  in  Ada  non-comment-lines  of  code.  The  size  of 
the  project  in  thousands  of  non-comment  lines  of  code  can  be  estimated  as: 

KLOC = (32 LGC + 150) / 1000 (Eq. 3)

As the complexity grows, the ratio tends to approximately 32 LOC for each unit
of LGC. This finding provided us with a method to compute the size of the projects given
an early measure of their complexity. This conversion is required to compare how close
this approach is with respect to other methods, such as Putnam's and Boehm's, that
require size as a parameter.
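
The conversion in Eq. 3 can be written as a small Python helper. The function names are hypothetical; the constants 32 and 150 are the ones given in Eq. 3.

```python
def kloc_from_lgc(lgc: float) -> float:
    """Estimated size in thousands of non-comment Ada lines (Eq. 3)."""
    return (32.0 * lgc + 150.0) / 1000.0


def loc_per_lgc(lgc: float) -> float:
    """Lines of code per unit of LGC; approaches 32 as LGC grows."""
    if lgc <= 0:
        raise ValueError("LGC must be positive")
    return (32.0 * lgc + 150.0) / lgc


for lgc in (100, 1000, 30000):
    print(lgc, kloc_from_lgc(lgc), round(loc_per_lgc(lgc), 3))
```

For small specifications the constant term matters (33.5 LOC per unit at 100 LGC), while at 30000 LGC the ratio is already within 0.01 of 32.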


2.  Comparison  between  Putnam’s  and  Boehm’s  Estimations 

Before trying to compare this estimation model with the industry standards
(Putnam and COCOMO), an experiment was conducted to compare these two methods
(see Chapter II). The experiment used Basic COCOMO because it is the only one that is
a macro model. Intermediate and Detailed COCOMO require a micro calibration that
cannot be done until the design is done. The purpose was to analyze early estimations, so
Basic COCOMO was the choice. For the comparison, Putnam's results were transformed
from man-years and years to man-months and months.
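
As a point of reference, Basic COCOMO can be sketched in Python as shown below. The coefficients are the standard values published by Boehm (1981) for the three development modes, not values calibrated in this research, and the function and dictionary names are hypothetical. Putnam's model is not reproduced here; only the unit conversion applied to its results is shown.

```python
# Basic COCOMO coefficients (Boehm, 1981): effort = a * KLOC^b (man-months),
# development time = c * effort^d (months).
BASIC_COCOMO = {
    "organic":      (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded":     (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc: float, mode: str) -> tuple:
    """Return (effort in man-months, development time in months)."""
    if kloc <= 0:
        raise ValueError("KLOC must be positive")
    a, b, c, d = BASIC_COCOMO[mode]
    effort = a * kloc ** b
    duration = c * effort ** d
    return effort, duration


def years_to_months(value_in_years: float) -> float:
    """Convert man-years to man-months, or years to months."""
    return 12.0 * value_in_years


for mode in BASIC_COCOMO:
    effort, duration = basic_cocomo(100.0, mode)
    print(f"{mode:13s} effort={effort:8.1f} MM  time={duration:5.1f} months")
```

For the same size, the estimated effort increases from organic to semidetached to embedded, which is why the three modes are computed separately in the comparison.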

The experiment consisted of computing Basic COCOMO and Putnam for
fictitious projects from 10 to 1000 KLOC. Basic COCOMO was computed for organic,
semidetached, and embedded systems to discriminate between these types of projects.
The results showed that in terms of effort, Putnam's method provides an estimation that is
close to the average between embedded and semidetached Basic COCOMO. In terms of
development time, the models are quite similar, Putnam's being more optimistic.

3. Search for the Relationship between Complexity (LGC) and
Development Time

Having found a complexity metric suited for this research, the next step was to
look for some sort of relationship between LGC and development time. A
simple experiment was conducted using the conversion ratio (Eq. 3) to obtain the size
inputs for the sample. The sample points ranged from 1000 LGC to 30000 LGC, which
means sample projects from 32 KLOC to almost 1 MLOC. The average estimation for the
development time using COCOMO and Putnam was computed for these projects. The
sample points are plotted with a smoothing thick line (Fig. 5.6). The logarithmic trendline
is plotted as a thin red line. A strong logarithmic correlation (R² = 0.9699) with the
following function was found (Fig. 5.6):

Time (months) = 12.968 ln(LGC) - 82.23 (Eq. 4)


This  equation  gives  a 
conservative  estimation  for 
projects  between  4000  and  20000 
LGC  (128  and  640  KLOC  of 
Ada).  The  estimation  seems  to  be 
too  optimistic  for  projects  smaller 
than  2000  LGC  or  greater  than 
25000  LGC. 
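
Eq. 4 can be evaluated with the Python sketch below. The function names are hypothetical, and the range check simply encodes the interval of 4000 to 20000 LGC that the text describes as giving a conservative estimation.

```python
import math


def development_time_months(lgc: float) -> float:
    """Estimated development time in months from LGC (Eq. 4)."""
    if lgc <= 0:
        raise ValueError("LGC must be positive")
    return 12.968 * math.log(lgc) - 82.23


def in_conservative_range(lgc: float) -> bool:
    """True for the 4000 to 20000 LGC interval reported as conservative."""
    return 4000 <= lgc <= 20000


for lgc in (1000, 4000, 20000, 30000):
    print(lgc, round(development_time_months(lgc), 1), in_conservative_range(lgc))
```

Because the fit is logarithmic, the formula yields non-positive times for very small specifications (below roughly 570 LGC), so it should not be applied far outside the sampled range of 1000 to 30000 LGC.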


Figure 5.6: Correlation between Development Time and Complexity


4.  Search  for  the  Relation  between  Efficiency  and  Development  Time 

Causal analysis showed (Chapter III) that the risk of the project should be
dependent on three factors: complexity, productivity and volatility of requirements. The
method to compute complexity and the equation to estimate γ, the estimated development
time (in months), based on complexity were discussed in the previous sections. The
literature on productivity classifies time spent at work into four categories:

•  Direct.  Time  spent  working  and  correcting  errors  on  the  product.  In 
ViteProject  terminology,  it  is  the  sum  of  work  and  rework. 

•  Indirect.  Time  spent  in  activities  supporting  the  work  such  as  meetings, 
coordination,  information  exchanges,  etc.  In  ViteProject  terminology,  it  is 
known  as  coordination  time. 

•  Idle.  Time  spent  without  work  to  do,  waiting  for  some  input.  In  ViteProject 
terminology,  it  is  known  as  waiting  time. 

•  Personal.  Time  spent  doing  anything  except  the  other  categories.  ViteProject 
does  not  compute  this  category  of  time.  However,  it  is  loosely  related  to  the 
noise  parameter  of  the  tool. 

Examining the time distribution of these categories reveals a remarkable pattern that
differentiates high-efficiency scenarios from low-efficiency ones. This effect is
independent of the other two variables of the simulation, which suggests that the time
distribution can be a good indicator of the efficiency of the organization. The ratio
between work and idle time can be automatically captured from the software evolution
steps, as suggested by [Ham, 1999f]. Figure 5.7 presents the time distributions for the
eight simulated scenarios. A pattern of time distributions can be clearly observed.
Scenarios with low efficiency have a percentage of idle time greater than 13% of the
total development time. The following characteristics can be observed from the
simulations:

•  Direct work is reduced by 10% when efficiency is high.

•  Indirect  work  is  reduced  by  40%  when  the  efficiency  is  high. 

•  Idle  time  is  reduced  by  70%  when  the  efficiency  is  high. 

Low-efficiency scenarios can also be recognized by the ratio of the percentage of
direct time to the percentage of idle time, which was called the efficiency ratio (EF):

EF = Direct% / Idle% (Eq. 5)

For high-efficiency scenarios 2.0 < EF, and for low-efficiency scenarios 0.8 < EF < 2.0.
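
The efficiency ratio can be sketched as follows. The function names and the sample percentages are hypothetical. The source gives no classification for EF of exactly 2.0 or for EF of 0.8 or less, so this sketch treats everything that is not above 2.0 as low efficiency.

```python
def efficiency_ratio(direct_pct: float, idle_pct: float) -> float:
    """EF = Direct% / Idle%, both given as percentages of total time."""
    if idle_pct <= 0:
        raise ValueError("idle percentage must be positive")
    return direct_pct / idle_pct


def is_high_efficiency(ef: float) -> bool:
    """High efficiency when EF exceeds the 2.0 threshold."""
    return ef > 2.0


# Hypothetical time distributions (direct %, idle %), not simulation output.
for direct, idle in ((60.0, 15.0), (50.0, 30.0)):
    ef = efficiency_ratio(direct, idle)
    print(direct, idle, round(ef, 2), is_high_efficiency(ef))
```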


Figure 5.7: Patterns of Time Distribution. Horizontal stacked bars show the percentage
(0% to 100%) of Direct, Indirect, and Idle time for each simulated scenario.


The simulations showed that for high-efficiency scenarios the development time
was 60% shorter than for low-efficiency ones. The reasons why the ratio EF is related to
productivity require further study. However, it is possible to conjecture that the reasons
could be related to:

•  Fit  of  job  and  people  skills. 

•  People  turnover,  generating  noise  and  productivity  losses  derived  from 
training  and  learning  curves. 

•  Number of people, influencing the productivity in two ways. If the number of
people is less than the number of roles in the software process, then the
productivity will be affected because someone will be dividing attention and
effort among more than one role. On the other hand, if the number of people
exceeds the number of roles, then the productivity will be affected by
additional communications.


5. Search for the Relation between Requirements Volatility and
Development Time

The requirements volatility (RV) is obtained by the following formula:

RV = INT((BR + DR) / 10) (Eq. 5)

For instance, if BR = 20% and DR = 10% then RV = 3. The simulations showed a
20% increase in the development time when the requirements volatility is high
(Appendix C).
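
A small sketch of the RV formula, assuming BR and DR are supplied as percentages exactly as in the example above. The function name is hypothetical, and INT is taken to mean truncation toward zero.

```python
def requirements_volatility(br_pct: float, dr_pct: float) -> int:
    """RV = INT((BR + DR) / 10), with BR and DR given as percentages."""
    return int((br_pct + dr_pct) / 10)


print(requirements_volatility(20, 10))
```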


6.  Calibration  of  the  Parameters 

To calibrate the values of the parameters described previously, a set of
simulations with ViteProject was conducted keeping the values of two variables constant
and changing the third one from low to high, in order to isolate the effect of each
variable. With three variables and two possible values for each one, the universe of
scenarios is reduced to the eight (2³) scenarios shown in Table 5.1:

Table 5.1: Simulated Scenarios

Scenario name    Productivity    Req. volatility    Complexity
LLL              Low             Low                Low
LLH              Low             Low                High
LHL              Low             High               Low
LHH              Low             High               High
HLL              High            Low                Low
HLH              High            Low                High
HHL              High            High               Low
HHH              High            High               High


Each scenario name consists of three letters that correspond to the value of each of
the three concepts under study: efficiency ratio (EF), requirements volatility (RV), and
complexity (CX). Each letter can take one of two values: high (H) or low (L). The
simulation tool was configured to run 100 simulations for each scenario, and the
organizational parameters were set to match the characteristics of software development.

The simulation reports can be found in Appendix C. Table 5.2 contains the
configuration used in the simulation. These values result from an analysis
performed with an expert system called Organizational Consultant (Burton, 1998). The tool
provides assistance in establishing the characteristics of organizations. The
characteristics of two fictive software development organizations were introduced in the
simulations: low efficiency (associated with CMM level 1 or 2), and high efficiency
(CMM level 3 or more). The reports of the expert system are presented in Appendix B.

To analyze the effect of efficiency, the results of the simulations of the following
scenarios were compared: LLL vs HLL, LLH vs HLH, LHL vs HHL, and LHH vs HHH.
It was found that for high productivity scenarios (Hxx) the development time improved
by 60%.

To analyze the effect of requirements volatility, the results of the simulations of the
following scenarios were compared: LLL vs LHL, LLH vs LHH, HLL vs HHL, and HLH
vs HHH. It was found that high requirements volatility (xHx) degraded the development
time by 20 to 30%.

To analyze the effect of complexity, the results of the simulations of the following
scenarios were compared: LLL vs LLH, LHL vs LHH, HLL vs HLH, and HHL vs HHH.
It was found that high complexity (xxH) degraded the development time by 20 to 30%.
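
The pairing scheme used in these comparisons can be sketched as below: enumerate the 2³ scenario names, then pair each scenario that has a factor at L with the one that differs only by having that factor at H. The function names are hypothetical; only the scenario naming convention comes from the text.

```python
from itertools import product

# Letter positions in a scenario name: efficiency, volatility, complexity.
FACTORS = ("EF", "RV", "CX")


def scenarios():
    """All eight scenario names, from LLL to HHH."""
    return ["".join(levels) for levels in product("LH", repeat=len(FACTORS))]


def comparison_pairs(factor_index):
    """Pairs (low, high) that differ only in the factor at factor_index."""
    pairs = []
    for name in scenarios():
        if name[factor_index] == "L":
            high = name[:factor_index] + "H" + name[factor_index + 1:]
            pairs.append((name, high))
    return pairs


for index, factor in enumerate(FACTORS):
    print(factor, comparison_pairs(index))
```

Holding two letters fixed in every pair is what lets the difference in simulated development time be attributed to the third factor alone.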

Table 5.2: Configuration Parameters for ViteProject

generated project errors, shown in the Simulator Analysis
Summary. 

Information  exchange  (0.8  high) 


D.  THE  MODELS 


Three  models  were  created  with  an  increasing  degree  of  accuracy.  These  models 
are  based  on: 

•  Metrics  from  the  three  risk  factors 

•  Weibull cumulative distribution function (Eq. 6)

•  The derivation of the time (Eq. 7)

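The cdf of the Weibull distribution, with shape parameter α, scale parameter β, and
location parameter γ, is:

$$P(x < t) = p = 1 - \exp\left(-\left(\frac{t - \gamma}{\beta}\right)^{\alpha}\right) \qquad \text{(Eq. 6)}$$

$$\therefore\; 1 - p = \exp\left(-\left(\frac{t - \gamma}{\beta}\right)^{\alpha}\right)$$

$$\therefore\; \ln(1 - p) = -\left(\frac{t - \gamma}{\beta}\right)^{\alpha}$$

$$\therefore\; -\ln(1 - p) = \left(\frac{t - \gamma}{\beta}\right)^{\alpha}$$

$$\therefore\; \left(-\ln(1 - p)\right)^{1/\alpha} = \frac{t - \gamma}{\beta}$$

$$\therefore\; \beta \left(-\ln(1 - p)\right)^{1/\alpha} = t - \gamma$$

$$\therefore\; \beta \left(-\ln(1 - p)\right)^{1/\alpha} + \gamma = t \qquad \text{(Eq. 7)}$$

Eq. 7 provides the estimated time for a given probability of success p. Note that t
and γ should be expressed in the same units. The following notation applies to the
algorithms that define the three models:

EF: efficiency level

RV: requirements volatility as percentage

CX: complexity in LGC

γm: delay in months

γ: delay in days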

All  the  algorithms  can  be  used  to  obtain  t  given  EF,  RV,  CX,  and  e  (the 
probability  of  being  correct);  or  to  obtain  e  given  EF,  RV,  CX,  and  t  (a  given  day  in  the 
future). 
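
The two directions of use can be sketched with a pair of functions, one for the Weibull cdf (Eq. 6) and one for its inverse (Eq. 7). The function names and the numeric parameters in the example are hypothetical; returning 0 for times at or before the delay is an assumption of this sketch.

```python
import math


def weibull_cdf(t, alpha, beta, gamma):
    """Eq. 6: probability of finishing by time t (same units as gamma)."""
    if t <= gamma:
        return 0.0  # assumption: no chance of finishing before the delay
    return 1.0 - math.exp(-(((t - gamma) / beta) ** alpha))


def weibull_time(p, alpha, beta, gamma):
    """Eq. 7: time needed to reach probability p of finishing."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return beta * (-math.log(1.0 - p)) ** (1.0 / alpha) + gamma


# Hypothetical parameters: shape, scale and delay in days.
alpha, beta, gamma = 1.95, 30.0, 160.0
days = weibull_time(0.90, alpha, beta, gamma)
print(round(days, 1), round(weibull_cdf(days, alpha, beta, gamma), 2))
```

Because Eq. 7 is the algebraic inverse of Eq. 6, feeding the result of one function into the other should return the original value up to rounding.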

Model 1: This model can be used when the requirements volatility is small.

Algorithm Model 1:

i.   If (EF > 2.0) then begin
         α = 1.95;
         γm = 0.28 * (13 * ln(LGC) - 82);
     end
     else begin
         α = 2.5;
         γm = 0.76 * (13 * ln(LGC) - 82);
     end;
ii.  γ = γm * 22;  // we assume 22 working days per month
iii. β = γ / 5.5;
iv.  p = 1 - exp(-(((t - γ) / β)^α));  // P(x <= t)
v.   t = β * (-ln(1 - e))^(1/α) + γ;  // time in days


Model 2: This model considers the three factors (EF, RV, and CX), but it neglects
the combined effect of EF and RV.

Algorithm Model 2:

i.   If (EF > 2.0) then begin
         α = 1.95;
         γm = 0.28 * (13 * ln(LGC) - 82);
     end
     else begin
         α = 2.5;
         γm = 0.76 * (13 * ln(LGC) - 82);
     end;
ii.  γ = γm * 22;  // we assume 22 working days per month
iii. If (RV > 30) then β = γ / 5.25  // RV more than 30%
     else β = γ / …;
iv.  p = 1 - exp(-(((t - γ) / β)^α));  // P(x <= t)
v.   t = β * (-ln(1 - e))^(1/α) + γ;  // time in days

Model 3: This model considers the three factors as well as the combined effects of
EF and RV.

Algorithm Model 3:

i.   If (EF > 2.0) then begin
         α = 1.95;
         γm = 0.32 * (13 * ln(LGC) - 82);
     end
     else begin
         α = 2.5;
         γm = 0.85 * (13 * ln(LGC) - 82);
     end;
ii.  γ = γm * 22;  // we assume 22 working days per month
iii. If (EF > 2.0) then β = γ / (5.71 + (RV - 20) * 0.046)
     else β = γ / (5.47 - (RV - 20) * 0.114);
iv.  p = 1 - exp(-(((t - γ) / β)^α));  // P(x <= t)
v.   t = β * (-ln(1 - e))^(1/α) + γ;  // time in days
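
As an illustration, the parameter selection of Models 1 and 3 can be sketched as functions that return the Weibull shape, scale, and delay in days. The function names and the guards against non-positive values are additions of this sketch, and the `+` sign in the high-efficiency divisor of Model 3 is a reading of a poorly legible character. Model 2 is left out because one of its constants is not legible here.

```python
import math

WORKING_DAYS_PER_MONTH = 22  # as assumed in step ii of the algorithms


def _base_months(lgc):
    """Common term 13 * ln(LGC) - 82 used by all three models."""
    base = 13 * math.log(lgc) - 82
    if base <= 0:
        raise ValueError("LGC too small for this formula")  # sketch guard
    return base


def model1_params(ef, lgc):
    """Model 1: returns (alpha, beta, gamma) with gamma in days."""
    alpha, factor = (1.95, 0.28) if ef > 2.0 else (2.5, 0.76)
    gamma = factor * _base_months(lgc) * WORKING_DAYS_PER_MONTH
    return alpha, gamma / 5.5, gamma


def model3_params(ef, rv, lgc):
    """Model 3: the scale also depends on EF and RV (RV as a percentage)."""
    alpha, factor = (1.95, 0.32) if ef > 2.0 else (2.5, 0.85)
    gamma = factor * _base_months(lgc) * WORKING_DAYS_PER_MONTH
    if ef > 2.0:
        divisor = 5.71 + (rv - 20) * 0.046
    else:
        divisor = 5.47 - (rv - 20) * 0.114
    if divisor <= 0:
        raise ValueError("RV outside the range this formula supports")
    return alpha, gamma / divisor, gamma


print(model1_params(ef=3.0, lgc=4000))
print(model3_params(ef=1.5, rv=30, lgc=4000))
```

The returned triple is what steps iv and v of each algorithm consume: it can be substituted into Eq. 6 to get a probability for a given day, or into Eq. 7 to get a day for a given probability.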

These three models were tested against 16 simulated projects, obtaining the scatter
plots of Figs. 5.8, 5.9, and 5.10 respectively. The errors appear as vertical segments
between the estimated and real values. The values of R, R², and standard errors are
shown in Table 5.3.


Table 5.3: Accuracy of the Three Models

                  Model 1    Model 2    Model 3
R                 0.9867     0.9890     0.9930
R²                0.9736     0.9781     0.9862
Standard error    30 days    27 days    22 days

Figure 5.9: Scatter Plot of Model 2


E. INTEGRATION WITH THE EVOLUTIONARY SOFTWARE PROCESS


The evolutionary prototyping software process (Fig. 5.11) is a directed graph with
two cycles. Initially, the analysts collect a set of issues, which represent concerns and
preliminary goals of the customers, and transform them into a more elaborate level of
description called requirements using a requirements analysis step.


Figure  5.11:  The  Evolutionary  Prototyping  Software  Process.  The  vertices  in 
the  graph  are  represented  by  rectangles.  The  arcs  labeled  with 
circles  represent  the  edges  of  the  digraph. 


The requirements are transformed into specifications, probably in PSDL, during
the specification design step. In the module implementation step the specifications are
automatically converted into code using an appropriate CASE tool such as CAPS. In the
program integration step, the modules obtained by the generator are integrated
into a program, possibly adding code created by programmers and reusable components.
This step includes integration testing and debugging. The program is demonstrated to the
customer in a prototype demo step that has two possible outcomes: a) the customer is not
satisfied and introduces criticisms, or b) the product conforms to the needs and
expectations of the customer. In the first case, the process continues by analyzing the
criticisms during an issue analysis step that produces new issues, closing the external
cycle in the graph. In the second case, the prototype contains all the required
functionality, so a set of optimizations is introduced during a product implementation
step. The resulting product is presented again to the customer during a product demo
step, closing the internal cycle of the graph.

A new vertex must be introduced in the graph to contain the risk
assessment step. A risk assessment step can be performed automatically after the
completion of the specifications. From the specifications it is feasible to derive the
complexity of the product. This information is used together with personnel and
organizational information, and with metrics of requirements collected from the
baselines, to produce the risk assessment. The risk assessment step integrates these
measures with issues in the issue analysis steps (Fig. 5.12).

The development life cycle can be visualized as a succession of prototyping
developments with increasing functionality followed by a final optimization that
produces the system. Each of these phases has the same activity pattern, so it is
reasonable to suppose that the delivery time for each one follows a Weibull
distribution, but with different parameters.

During  each  phase  a  certain  number  of  problem  events  occur.  A  problem  event  is 
an  effort-consuming  situation  that  introduces  a  certain  amount  of  functional  complexity 
to be solved (caused by a new requirement, a change in a requirement, or as the
consequence of rework), and a certain amount of information exchange.


Figure 5.12: The Proposed Improvement


It is assumed that the occurrence of problem events in each phase follows a
Poisson distribution with a different mean (λ) for each phase. So, the entire development
life cycle is a non-homogeneous Poisson process (Fig. 5.13). The assumption of this
distribution is based on the following reasoning:

•  There  exists  a  certain  rate  of  occurrence  of  events. 

•  The  probability  of  more  than  one  event  occurring  in  a  time  interval  depends 
on  the  length  of  the  interval. 

•  The number of events during one time interval is independent of the number
occurring prior to this time interval.
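
A non-homogeneous Poisson process of this kind can be simulated with a rate that is constant within each phase. The sketch below is illustrative only: the function name, the phase durations, and the rates are hypothetical, and the rate is expressed as events per day so that the mean number of events in a phase is the rate multiplied by the phase length.

```python
import random


def simulate_problem_events(phases, seed=None):
    """Event times for phases given as (duration_days, events_per_day)."""
    rng = random.Random(seed)
    events = []
    phase_start = 0.0
    for duration, rate in phases:
        phase_end = phase_start + duration
        if rate > 0:
            # Exponential gaps between events give a Poisson count per phase.
            t = phase_start + rng.expovariate(rate)
            while t < phase_end:
                events.append(t)
                t += rng.expovariate(rate)
        phase_start = phase_end
    return events


# Hypothetical life cycle: two prototyping phases and a final optimization.
life_cycle = [(60, 0.10), (45, 0.25), (30, 0.05)]
times = simulate_problem_events(life_cycle, seed=7)
print(len(times), [round(t, 1) for t in times[:5]])
```

Restarting the exponential clock at each phase boundary is valid because exponential waiting times are memoryless, which is the same independence property stated in the last bullet.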

Figure 5.13: The Development Life Cycle. The shadow represents the
non-homogeneous Poisson process of the problem events. The curves
represent the Weibull probability distributions for the development time
of each phase.


VI. CONCLUSIONS


This thesis introduced a formal model to assess the risk of software projects based
on metrics automatically collectable from the project baseline. The model enables a
project manager to evaluate the probability of success of the project very early in the life
cycle. The problem of subjectivity in risk assessment is addressed by using a formal
method. Any decision-maker will arrive at the same estimations, regardless of
expertise.

A second benefit of this approach is that the model is an estimation tool for time
and effort, which improves the state of the art. The model addresses the weaknesses of
current standards for estimation because the constraint of frozen requirements, present in
COCOMO 81, COCOMO II and Putnam, is not an issue in this model.

Finally, the research was conducted using simulations and a small set of real
projects. It is necessary to conduct a survey with a large set of real projects to confirm the
results.


APPENDIX A

FORMAL  DEFINITION  OF  THE 
RELATIONAL  HYPERGRAPH  MODEL 


Definition  1:  Directed  hypergraph  (Ham,  1999f).  A  directed  hypergraph  is  a  tuple  H  = 
(N,  E,  I,  O)  where  N  is  a  set  of  nodes,  E  is  a  set  of  hyperedges,  I  is  a  function  giving  the 
set  of  input  nodes  of  each  hyperedge,  and  O  is  a  function  giving  the  output  nodes  of  each 
hyperedge. 


Definition 2: Path (Ham, 1999f). A path p from node n1 to node nk is a sequence of
hyperedges e1, ..., ek-1 (k > 0), and a sequence of nodes n1, ..., nk.

Definition 3: Acyclic hypergraph (Ham, 1999f). A hypergraph H = (N, E, I, O) is acyclic
if and only if there is no path from any node in H to itself.


Definition 4: Reachable (Ham, 1999f). A set N of nodes is reachable from a set R of
nodes if and only if there is a path to each node n ∈ N from some node r ∈ R. A
hypergraph H is reachable from a set R of its nodes if and only if all its nodes are
reachable from R. The root of the hypergraph H is a node from which H is reachable. A
leaf of H is a node from which no other node is reachable.
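
Definitions 1 to 4 can be sketched as a small data structure. This is an illustration under stated assumptions, not the formalization of (Ham, 1999f): the class and method names are hypothetical, a path is assumed to step from an input node of a hyperedge to one of its output nodes, cycles are checked with paths of at least one hyperedge, and the nodes of R count as trivially reachable from R.

```python
from dataclasses import dataclass


@dataclass
class DirectedHypergraph:
    nodes: set     # N
    inputs: dict   # I: hyperedge -> set of input nodes
    outputs: dict  # O: hyperedge -> set of output nodes

    def successors(self, node):
        """Nodes one hyperedge away from node (assumed path step)."""
        found = set()
        for edge, ins in self.inputs.items():
            if node in ins:
                found |= self.outputs[edge]
        return found

    def reachable_from(self, roots):
        """Nodes reachable from roots through at least one hyperedge."""
        seen, stack = set(), list(roots)
        while stack:
            for nxt in self.successors(stack.pop()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def is_acyclic(self):
        """Definition 3: no node has a path back to itself."""
        return all(n not in self.reachable_from({n}) for n in self.nodes)

    def is_reachable_from(self, roots):
        """Definition 4: every node of H is reachable from roots."""
        return self.nodes <= set(roots) | self.reachable_from(roots)

    def leaves(self):
        """Nodes from which no other node is reachable."""
        return {n for n in self.nodes if not self.successors(n)}


# Hypothetical example using step names as hyperedges.
process = DirectedHypergraph(
    nodes={"requirements", "specifications", "modules", "program"},
    inputs={
        "specification design": {"requirements"},
        "module implementation": {"specifications"},
        "program integration": {"modules"},
    },
    outputs={
        "specification design": {"specifications"},
        "module implementation": {"modules"},
        "program integration": {"program"},
    },
)
print(process.is_acyclic(), process.leaves())
```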

Definition  5:  Composite  node  and  composite  edge  (Ham,  1999f).  A  composite  node  is  a 
set  of  nodes,  and  a  composite  edge  is  a  set  of  edges. 

Definition 6: Hypergraph set (Ham, 1999f). The hypergraph set is the union of nodes and
edges  of  a  set  of  hypergraphs. 


Definition 7: Minimal hypergraph (Ham, 1999f). Let Nin and Nout be input and output
nodes of a hyperedge e in the hypergraph H = (N, E, I, O). A minimal hypergraph Hm =
(Nin ∪ Nout, {e}, I, O) is a minimal unit of the hypergraph whose edge set has only one
edge e, and where Nin = I(e) and Nout = O(e).


Definition 8: Refinement of a composite node (Ham, 1999f). Let H = (N, E, I, O). The
refinement of a composite node n ∈ N is a directed minimal hypergraph Hm = (Nin ∪ Nout,
{e}, I, O), where the input node set Nin = {n1, ..., nn}, the output node set Nout = {n}, and
the edge set is {e}. The edge e is called decomposition edge and relates the refinement
node with its decomposition.


Definition 9: Opposite hypergraph (Ham, 1999f). Let H = (N, E, I, O) then its opposite
hypergraph H⁻¹ = (N, E, O, I).


Definition 10: Hyperpath (Ham, 1999f). A hyperpath in the hypergraph H = (N, E, I, O)
is the minimal hypergraph from a set of nodes N1 to another set of nodes N2 where N1 ⊂
N and N2 ⊂ N.


Definition 11: Refinement of a composite edge (Ham, 1999f). Let H = (N, E, I, O). The
refinement of a composite edge e = e1, ..., en where e ∈ E, is a hypergraph set of minimal
hypergraphs R = (Nin ∪ Nout, e, I, O). Nin = I(e), Nout = O(e), and e1, ..., en are called
subedges.


Definition 12: Refinement of a minimal hypergraph (Ham, 1999f). Let Hm = (Nin ∪ Nout,
{e}, I, O) be a minimal hypergraph. The refinement of a minimal hypergraph is a
hypergraph set R = Hin ∪ Hout ∪ He, where Hin is a refinement of Nin, Hout is a refinement
of Nout, and He is a refinement of e. Hm can be viewed as a graph composed of two nodes
(Nin, Nout) and one edge (e) where Nin and Nout are hypergraphs and e is a hyperedge.


Definition  13:  Evolutionary  hypergraph  (Ham,  1999f).  An  evolutionary  hypergraph  is  a 
labeled,  directed,  and  acyclic  hypergraph  H  =  (N,  E,  I,  O)  together  with  label  functions 
that  give  component  attributes  to  the  nodes  and  step  attributes  to  the  edges. 


Definition  14:  Top-level  evolution  step  (Ham,  1999f).  A  hyperedge  is  called  top-level 
evolution  step  if  there  are  no  parent  evolution  steps. 


Definition  15:  Atomic  evolution  step  (Ham,  1999f).  An  atomic  evolution  step  is  an 
atomic  (non  decomposable)  edge. 


Definition 16: Top-level evolutionary hypergraph (Ham, 1999f). A top-level evolutionary
hypergraph is an evolutionary hypergraph whose edges are top-level evolution steps.


Definition  17:  Atomic  evolutionary  hypergraph  (Ham,  1999f).  An  atomic  evolutionary 
hypergraph  is  an  evolutionary  hypergraph  with  an  atomic  evolution  step  as  its  hyperedge. 


Definition  18:  Primary  input  (Ham,  1999f).  Primary  inputs  are  different  versions  of  the 
output  component  of  an  evolutionary  step. 


Definition  19:  Secondary  inputs  (Ham,  1999f).  Secondary  inputs  are  all  other  input 
components  required  in  an  evolutionary  step  that  are  not  primary  inputs. 


Definition  20:  Primary-input-driven  hypergraph  (Ham,  1999f).  An  evolutionary 
hypergraph  is  called  primary-input-driven  if  and  only  if  its  nodes  are  primary  inputs. 


Definition  21:  Secondary-input-driven  hypergraph  (Ham,  1999f).  An  evolutionary 
hypergraph  is  called  secondary-input-driven  if  and  only  if  its  nodes  are  secondary  inputs. 


Definition  22:  Relational  hypergraph  (Ham,  1999f).  A  relational  hypergraph  is  an 
evolutionary  hypergraph  in  which  the  dependency  relationships  between  components  and 
steps  can  have  a  hierarchy  of  specialized  interpretations. 


Definition 23: Software prototyping demo step (Ham, 1999f). A software prototyping
demo step is a step in which the input components are a set of criticisms (C1), a set of
programs (P), a set of test scenarios (T), and a set of stakeholders (U), producing an output
component set of criticisms (C2).


Definition 24: Issue analysis step (Ham, 1999f). An issue analysis step is a step in which
the input components are a set of previous issues (J1), a set of stakeholders (U), a set of
criticisms (C), producing an output component set of new issues (J2).


Definition 25: Requirement analysis step (Ham, 1999f). A requirement analysis step is a
step in which the input components are a set of previous requirements (R1), a set of
issues (J), a set of stakeholders (U), producing an output component set of new
requirements (R2).


Definition 26: Specification design step (Ham, 1999f). A specification design step is a
step in which the input components are a set of previous specifications (S1), a set of
stakeholders (U), a set of requirements (R), producing an output component set of new
specifications (S2).


Definition 27: Module implementation step (Ham, 1999f). A module implementation step
is a step in which the input components are a set of previous modules (M1), a set of
stakeholders (U), a set of specifications (S), producing an output component set of new
modules (M2).


Definition 28: Program integration step (Ham, 1999f). A program integration step is a
step in which the input components are a set of previous programs (P1), a set of
stakeholders (U), a set of modules (M), producing an output component set of new
programs (P2).


Definition 29: Software product demo step (Ham, 1999f). A software product demo step
is a step in which the input components are a set of previous optimizations (K1), a set of
stakeholders (U), a set of programs (P), a set of test scenarios (T), producing an output
component set of new optimizations (K2).


Definition 30: Software product implementation step (Ham, 1999f). A software product
implementation step is a step in which the input components are a set of previous
versions of programs (P1), a set of stakeholders (U), a set of optimizations (K), producing
an output component set of new programs (P2).


Definition 31: Software prototyping evolution process (Ham, 1999f). A software
prototyping evolution process is a hypergraph with a path with the following properties:

(1)  Steps  are  software  prototype  or  product  demo,  issue  analysis,  requirement 
analysis,  specification  design,  module  implementation  and  program  integration. 

(2)  Nodes  are  old  version  programs,  criticisms,  issues,  requirements, 
specifications,  modules,  and  new  version  programs. 


Definition 32: Software product generation process (Ham, 1999f). A software product
generation process is a relational hypergraph with a path with the following properties:

(1)  Steps  are  software  prototype  or  product  demo,  and  program  integration. 

(2)  Nodes  are  new  version  prototypes  or  old  version  programs,  optimizations, 
and  new  version  programs. 


Definition  33:  Software  evolution  process  (Ham,  1999f).  A  software  evolution  process  is 
a  relational  hypergraph  with  a  combined  structure  of  software  prototyping  evolution 
processes  and  software  product  generation  processes. 


Definition  34:  Top-level  relational  hypergraph  net  (Ham,  1999f).  A  top-level  relational 
hypergraph  is  a  set  composed  by  a  set  of  primary  inputs,  one  or  more  sets  of  secondary 
inputs, and a set of output nodes to a top-level evolution step. (Ham, 1999f) called this
concept SPIDER (Step Processed in Different Entrance Relationships).


Definition  35:  Atomic  relational  hypergraph  net  (Ham,  1999f).  An  atomic  relational 
hypergraph  is  a  set  composed  by  a  set  of  primary  inputs,  one  or  more  sets  of  secondary 
inputs, and a set of output nodes to an atomic evolution step. (Ham, 1999f) called this
concept atomic SPIDER.


APPENDIX B


ANALYSIS  WITH  ORGANIZATIONAL  CONSULTANT 


The following reports were produced using the Organizational Consultant expert system.
The first report analyzes a fictive organization, "Software Engineering", which represents a
typical public software development department below CMM level 3. The second report
analyzes the same organization after reaching CMM level 3.


REPORT  SUMMARY  -  Software  Engineering 

Time:  2:17:22  PM,  12/29/99 
Scenario:  Scenario  1 

INPUT  DATA  SUMMARY 

The  description  below  summarizes  and  interprets  your  answers  to  the  questions  about  your  organization 
and  its  situation.  It  states  your  answers  concerning  the  organization's  current  configuration,  complexity, 
formalization,  and  centralization.  Your  responses  to  the  various  questions  on  the  contingencies  of  age,  size, 
technology,  environment,  management  style,  cultural  climate  and  strategy  factors  are  also  given.  The 
writeup  below  summarizes  the  input  data  for  the  analysis. 

-  Software  Engineering  has  an  adhocracy  configuration  (cf  100). 

-  Software  Engineering  has  a  small  number  of  different  jobs  (cf  100). 

-  Of  the  employees  at  Software  Engineering  76  to  100  %  have  an  advanced  degree  or  many  years  of  special 
training  (cf  100). 

-  Software  Engineering  has  3  to  5  vertical  levels  separating  top  management  from  the  bottom  level  of  the 
organization  (cf  100). 

-  The  mean  number  of  vertical  levels  is  3  to  5  (cf  100). 

-  Software  Engineering  has  1  or  2  separate  geographic  locations  (cf  100). 

- Software Engineering's average distance of these separate units from the organization's headquarters is of
no relevance because there is only one site undetermined (cf 100).

- An undetermined number of Software Engineering's total workforce is located at these separate units (cf 100).

-  Job  descriptions  are  available  for  none  or  an  undetermined  number  of  employees  (cf  100). 

-  Where  written  job  descriptions  exist,  the  employees  are  supervised  an  undetermined  manner  to  ensure 
compliance  with  standards  set  in  the  job  description  (cf  100). 

-  The  employees  are  allowed  to  deviate  in  an  undetermined  way  from  the  standards  (cf  100). 

-  0  to  20  %  non-managerial  employees  are  given  written  operating  instructions  or  procedures  for  their  job 
(cf  100). 

-  The  written  instructions  or  procedures  given  are  of  no  relevance  as  there  are  no  written  instructions  or 
they  may  be  undetermined  (cf  100). 

-  Supervisors  and  middle  managers  are  to  some  extent  free  from  rules,  procedures,  and  policies  when  they 
make  decisions  (cf  100). 

-  Less  than  20  %  of  all  the  rules  and  procedures  that  exist  within  the  organization  are  in  writing  (cf  100). 

-  Top  Management  is  to  a  great  extent  involved  in  gathering  the  information  they  will  use  in  making 
decisions  (cf  100). 

-  Top  management  participates  in  the  interpretation  of  61  to  80  %  of  the  information  input  (cf  100). 

-  Top  management  directly  controls  21  to  40  %  of  the  decisions  executed  (cf  100). 

-  The  typical  middle  manager  has  little  discretion  over  establishing  his  or  her  budget  (cf  100). 

-  The  typical  middle  manager  has  little  discretion  over  how  his/her  unit  will  be  evaluated  (cf  100). 

-  The  typical  middle  manager  has  little  discretion  over  the  hiring  and  firing  of  personnel  (cf  100). 

- The typical middle manager has little discretion over personnel rewards - (ie, salary increases and
promotions) (cf 100).

-  The  typical  middle  manager  has  some  discretion  over  purchasing  equipment  and  supplies  (cf  100). 

-  The  typical  middle  manager  has  little  discretion  over  establishing  a  new  project  or  program  (cf  100). 

-  The  typical  middle  manager  has  very  great  discretion  over  how  work  exceptions  are  to  be  handled  (cf 
100). 

- Software Engineering has 25 employees (cf 100).

- Software Engineering’s age is young (cf 100).

-  Software  Engineering's  ownership  status  is  public  (cf  100). 

- Software Engineering has some different products (cf 100).

-  Software  Engineering  has  few  different  markets  (cf  100). 

-  Software  Engineering  only  operates  in  one  country  (cf  100). 

-  Software  Engineering  has  no  different  products  in  the  foreign  markets  (cf  100). 

-  Software  Engineering’s  major  activity  is  categorized  as  service  (cf  100). 

- Software Engineering has a specialized customer-oriented service technology (cf 100).

-  Software  Engineering  has  undetermined  technology  (cf  100). 

-  Software  Engineering’s  technology  is  undetermined  with  respect  to  divisibility  (cf  100). 

-  Software  Engineering’s  technology  dominance  is  strong  (cf  100). 

-  Software  Engineering  has  given  no  information  about  a  possible  advanced  information  system  (cf  100). 

-  Software  Engineering’s  environment  is  complex  (cf  100). 

-  The  uncertainty  of  Software  Engineering’s  environment  is  high  (cf  100). 

-  The  equivocality  of  the  organization’s  environment  is  high  (cf  100). 

-  Software  Engineering’s  environment  has  an  undetermined  level  of  hostility  (cf  100).

-  Top  management  prefers  to  make  resource  allocations  and  detailed  operating  decisions  (cf  100).

-  Top  management  primarily  prefers  to  make  long-term  decisions  (cf  100). 

-  Top  management  has  a  preference  for  very  aggregate  information  when  making  decisions  (cf  100). 

-  Top  management  has  a  preference  for  some  proactive  actions  and  some  reactive  actions  (cf  100). 

-  Top  management  is  risk  averse  (cf  100). 

-  Top  management  has  a  preference  for  a  combination  of  motivation  and  control  (cf  100). 

-  Software  Engineering  operates  in  an  industry  with  a  medium  capital  requirement  (cf  100). 

-  Software  Engineering  has  a  high  product  innovation  (cf  100).

-  Software  Engineering  has  a  high  process  innovation  (cf  100). 

-  Software  Engineering  has  a  high  concern  for  quality  (cf  100). 

-  Software  Engineering’s  price  level  is  undetermined  relative  to  its  competitors  (cf  100). 

-  The  level  of  trust  is  high  (cf  100). 

-  The  level  of  conflict  is  low  (cf  100). 

-  The  employee  morale  is  not  known  (cf  100). 

-  Rewards  are  given  in  a  not  known  fashion  (cf  100). 

-  The  resistance  to  change  is  not  known  (cf  100). 

-  The  leader  credibility  is  high  (cf  100). 

-  The  level  of  scapegoating  is  low  (cf  100). 

THE  SIZE 

The  size  of  the  organization  -  large,  medium,  or  small  -  is  based  upon  the  number  of  employees,  adjusted 
for  their  level  of  education  or  technical  skills. 

Based  on  the  answers  you  provided,  it  is  most  likely  that  your  organization’s  size  is  medium  (cf  50). 

More  than  75  %  of  the  people  employed  by  Software  Engineering  have  a  high  level  of  education. 
Adjustments  are  made  to  this  effect.  The  adjusted  number  of  employees  is  lower  than  500  but  greater  than 
100  and  Software  Engineering  is  categorized  as  medium.  However,  for  this  adjusted  number  this  size  does 
not  have  a  major  effect  on  the  organizational  structure. 
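
The size rule described above can be sketched as follows. This is only an illustration, not the rule base of Organizational Consultant: the function name `classify_size` is hypothetical, the input is assumed to be the headcount after the education adjustment (the adjustment itself is not given in the report), and the handling of the boundary values 100 and 500 is an assumption. Only the medium band, greater than 100 and lower than 500, comes from the report text.

```python
def classify_size(adjusted_employees: float) -> str:
    """Map an education-adjusted headcount to a size category.

    Only the medium band (greater than 100, lower than 500) is stated in
    the report; the labels for the other two bands are assumptions.
    """
    if adjusted_employees <= 100:   # assumed: 100 or fewer is small
        return "small"
    if adjusted_employees < 500:    # stated: between 100 and 500 is medium
        return "medium"
    return "large"                  # assumed: 500 or more is large
```

For example, `classify_size(250)` falls in the medium band, which matches the category reported for Software Engineering.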

THE  CLIMATE 

The  organizational  climate  effect  is  the  summary  measure  of  people  and  behavior. 

Based  on  the  answers  you  provided,  it  is  most  likely  that  the  organizational  climate  is  a  group  climate  (cf
76).

It  could  also  be  that  the  climate  is  a  developmental  climate  (cf  73).

The  group  climate  is  characterized  as  a  friendly  place  to  work  where  people  share  a  lot  of  themselves.  It  is 
like  an  extended  family.  The  leaders,  or  head  of  the  organization,  are  considered  to  be  mentors  and,  perhaps 
even  parent  figures.  The  organization  is  held  together  by  loyalty  or  tradition.  Commitment  is  high.  The 
organization  emphasizes  the  long-term  benefit  of  human  resource  development  with  high  cohesion  and 
morale  being  important.  Success  is  defined  in  terms  of  sensitivity  to  customers  and  concern  for  people.  The 
organization  places  a  premium  on  teamwork,  participation,  and  consensus. 

When  the  organization  has  a  high  level  of  trust,  it  is  likely  that  the  organization  has  a  group  climate.  An
organization  with  little  conflict  can  be  categorized  as  having  a  group  climate.  High  leader  credibility
characterizes  an  organization  with  a  group  climate.  An  organization  with  a  low  level  of  scapegoating  may
have  a  group  climate.

The  developmental  climate  is  characterized  as  a  dynamic,  entrepreneurial  and  creative  place  to  work. 
People  stick  their  necks  out  and  take  risks.  The  leaders  are  considered  to  be  innovators  and  risk  takers.  The 
glue  that  holds  organizations  together  is  commitment  to  experimentation  and  innovation.  The  emphasis  is 
on  being  on  the  leading  edge.  Readiness  for  change  and  meeting  new  challenges  are  important.  The 
organization’s  long-term  emphasis  is  on  growth  and  acquiring  new  resources.  Success  means  having  unique 
and  new  products  or  services  and  being  a  product  or  service  leader  is  important.  The  organization 
encourages  individual  initiative  and  freedom. 

When  the  organization  has  a  high  to  medium  level  of  trust,  it  is  likely  that  the  organization  has  a
developmental  climate.  An  organization  with  a  low  level  of  conflict  can  be  categorized  as  having  a
developmental  climate.  Medium  to  high  leader  credibility  characterizes  an  organization  with  a
developmental  climate.  An  organization  with  a  medium  level  of  scapegoating  may  have  a  developmental
climate.

THE  MANAGEMENT  STYLE 

The  level  of  management's  microinvolvement  in  decision  making  is  the  summary  measure  of  management 
style.  Leaders  have  a  low  preference  for  microinvolvement;  managers  have  a  high  preference  for 
microinvolvement. 

Based  on  the  answers  you  provided,  it  is  most  likely  that  your  management  profile  has  a  medium 
preference  for  microinvolvement  (cf  78). 

The  management  of  Software  Engineering  has  a  preference  for  letting  some  decisions  be  made  by  other 
managers.  This  will  lead  toward  a  medium  preference  for  microinvolvement.  The  management  of  Software 
Engineering  has  a  preference  for  taking  actions  on  some  decisions  and  being  reactive  toward  others.  This 
will  lead  toward  a  medium  preference  for  microinvolvement.  Management  has  a  preference  for  using  both 
motivation  and  control  to  coordinate  the  activities,  which  leads  toward  a  medium  preference  for 
microinvolvement. 

THE  STRATEGY 

The  organization's  strategy  is  categorized  as  one  of  either  prospector,  analyzer  with  innovation,  analyzer 
without  innovation,  defender,  or  reactor.  These  categories  follow  Miles  and  Snow's  typology.  Based  on 
your  answers,  the  organization  has  been  assigned  to  a  strategy  category.  This  is  a  statement  of  the  current 
strategy;  it  is  not  an  analysis  of  what  is  the  best  or  preferred  strategy  for  the  organization. 

Based  on  the  answers  you  provided,  it  is  most  likely  that  your  organization's  strategy  is  an  analyzer  with 
innovation  strategy  (cf  68). 

It  could  also  be:  a  prospector  (cf  64). 

An  organization  with  an  analyzer  with  innovation  strategy  is  an  organization  that  combines  the  strategy  of 
the  defender  and  the  prospector.  It  moves  into  the  production  of  a  new  product  or  enters  a  new  market  after 
viability  has  been  shown.  But  in  contrast  to  an  analyzer  without  innovation,  it  has  innovations  that  run 
concurrently  with  the  regular  production.  It  has  a  dual  technology  core. 

An  organization  with  a  medium  capital  investment  is  likely  to  have  some  capabilities  rather  fixed,  but  can 
also  adjust.  The  analyzer  with  innovation  which  seeks  new  opportunities  but  also  maintains  its  profitable 
position  is  appropriate.  With  a  concern  for  high  quality  an  analyzer  with  innovation  strategy  is  a  likely 
strategy  for  Software  Engineering.  With  top  management  preferring  a  medium  level  of  microinvolvement,
top  management  wants  some  influence.  This  can  be  obtained  via  control  over  current  operations.  Product 
innovation  should  be  less  controlled.  The  strategy  is  therefore  likely  to  be  analyzer  with  innovation. 

An  organization  with  a  prospector  strategy  is  an  organization  that  continually  searches  for  market 
opportunities  and  regularly  experiments  with  potential  responses  to  emerging  environmental  trends.  Thus, 
the  organization  is  often  the  creator  of  change  and  uncertainty  to  which  its  competitors  must  respond. 
However,  because  of  its  strong  concern  for  product  and  market  innovation,  a  prospector  usually  is  not 
completely  efficient. 

With  a  concern  for  high  quality  a  prospector  strategy  is  a  likely  strategy  for  Software  Engineering. 

THE  CURRENT  ORGANIZATIONAL  CHARACTERISTICS 

Based  on  your  answers,  the  organization’s  complexity,  formalization,  and  centralization  have  been 
calculated.  This  is  the  current  organization.  Later  in  this  report,  there  will  be  recommendations  for  the 
organization. 

The  current  organizational  complexity  is  medium  (cf  100). 

The  current  horizontal  differentiation  is  medium  (cf  100). 

The  current  vertical  differentiation  is  low  (cf  100). 

The  current  spatial  differentiation  is  low  (cf  100). 

The  current  centralization  is  medium  (cf  100). 

The  current  formalization  is  low  (cf  100). 

The  current  organization  has  been  categorized  with  respect  to  formalization,  centralization,  and  complexity. 
The  categorization  is  based  on  the  input  you  gave  and  does  not  take  missing  information  into  account. 

SITUATION  MISFITS 

A  situation  misfit  is  an  unbalanced  situation  among  the  contingency  factors  of  management  style,  size, 
environment,  technology,  climate,  and  strategy. 

The  following  misfits  are  present:  (cf  100). 

Software  Engineering  has  both  an  analyzer  strategy  and  few  products.  Generally,  more  products  are 
required  for  an  analyzer.  A  few  products  may  be  reasonable  in  the  short  run,  but  an  analyzer  should  be  in 
constant  consideration  of  new  possibilities.  When  a  few,  unchanging  products  become  the  norm,  the 
analyzer  should  broaden  its  scope  of  new  opportunities. 

ORGANIZATIONAL  CONSULTANT  RECOMMENDATIONS 

Based  on  your  answers  about  the  organization,  its  situation,  and  the  conclusions  with  the  greatest  certainty 
factor  from  the  analyses  above  Organizational  Consultant  has  derived  recommendations  for  the 
organization’s  configuration,  complexity,  formalization,  and  centralization.  There  are  also 
recommendations  for  coordination  and  control,  the  appropriate  media  richness  for  communications,  and 
incentives.  More  detailed  recommendations  for  possible  changes  in  the  current  organization  are  also 
provided. 

ORGANIZATIONAL  CONFIGURATIONS 

The  most  likely  configuration  that  best  fits  the  situation  has  been  estimated  to  be  a  matrix  configuration  (cf 
59). 

A  matrix  structure  is  a  structure  that  assigns  specialists  from  functional  departments  to  work  on  one  or 
more  interdisciplinary  teams  that  are  led  by  project  leaders.  Permanent  product  teams  are  also  possible.  A 
dual  hierarchy  manages  the  same  activities  and  individuals  at  the  same  time. 

When  Software  Engineering's  environment  has  neither  low  equivocality  nor  low  complexity,  the 
configuration  should  be  matrix.  When  Software  Engineering  is  of  medium  size,  the  configuration  can  be  a 
matrix  configuration.  The  matrix  configuration  is  a  more  likely  configuration  when  Software  Engineering 
has  a  unit  production  technology. 

ORGANIZATIONAL  CHARACTERISTICS


The  recommended  degree  of  organizational  complexity  is  medium  (cf  43). 

Medium  size  organizations  should  have  medium  organizational  complexity.  Top  management  of  Software 
Engineering  has  a  preference  for  a  medium  level  of  microinvolvement,  which  drives  the  organizational 
complexity  towards  medium.  A  group  climate  in  the  organization  requires  a  medium  level  of  complexity 
with  a  low  level  of  vertical  differentiation. 

The  recommended  degree  of  horizontal  differentiation  is  low  (cf  28). 

It,  too,  could  be:  medium  (cf  19). 

The  recommended  degree  of  vertical  differentiation  is  low  (cf  38). 

The  recommended  degree  of  formalization  is  medium  (cf  48). 

There  should  be  some  formalization  between  the  organizational  units  but  less  formalization  within  the  units 
due  to  the  high  professionalization.  Software  Engineering  has  a  medium  capital  requirement,  which  leads  to 
medium  formalization.  Medium  size  organizations  should  have  medium  formalization.  Medium 
formalization  is  consistent  with  the  leadership  style  when  top  management’s  preference  for 
microinvolvement  is  neither  very  great  nor  very  low. 

The  recommended  degree  of  centralization  is  medium  (cf  45). 

Software  Engineering  has  an  analyzer  with  innovation  strategy.  Centralization  should  be  medium.  There 
should  be  tight  control  over  current  activities  and  looser  control  over  new  ventures.  Software  Engineering 
is  of  medium  size.  Such  organizations  should  have  medium  to  high  centralization.  Medium  centralization  is 
recommended  when  top  management  has  neither  a  great  desire  nor  very  little  desire  for  microinvolvement. 

Software  Engineering's  span  of  control  should  be  narrow  (cf  30).

It,  too,  at  places  should  be  moderate  (cf  25). 

Since  Software  Engineering  has  a  nonroutine  technology,  it  should  have  a  narrow  span  of  control. 

Software  Engineering  should  use  media  with  high  media  richness  (cf  85). 

The  information  media  that  Software  Engineering  uses  should  provide  a  large  amount  of  information  (cf 
85). 

Incentives  should  be  based  on  results  (cf  85). 

Software  Engineering  should  use  an  undetermined  process  as  means  for  coordination  and  control  (cf  100).

When  the  environment  of  Software  Engineering  has  high  equivocality,  high  uncertainty,  and  high 
complexity,  coordination  and  control  should  be  obtained  through  integrators  and  group  meetings.  The 
richness  of  the  media  should  be  high  with  a  large  amount  of  information.  Incentives  must  be  results  based. 
Coordination  is  a  major  issue  requiring  a  lot  of  time  by  functional  managers  and  product  or  project 
managers.  Managers  should  make  frequent  adjustments  in  order  to  maintain  project  and  product  goals  and 
use  scarce  functional  resources  and  personnel  efficiently.  In  an  international  firm,  matrix  dimensions  will 
likely  include  country  or  region  and  may  include  either  product,  customer,  or  function.  Project  or  product 
managers  will  likely  be  required  to  champion  new  innovations  in  customers,  products  or  technologies. 
When  the  organization  has  a  group  climate,  coordination  should  be  obtained  using  integrators  and  group 
meetings.  Incentives  could  be  results  based  but  with  a  group  orientation.  An  organization  with  a  group 
climate  will  likely  have  to  process  a  large  amount  of  information  and  will  need  information  media  with 
high  richness. 


ORGANIZATIONAL  MISFITS 

An  organizational  misfit  is  a  mismatch  between  the  recommended  organization  and  the  current  organization.

The  following  organizational  misfits  are  present:  (cf  100).

Current  and  prescribed  configuration  do  not  match. 

Current  and  prescribed  formalization  do  not  match. 
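
The comparison behind the formalization misfit can be sketched as below. This is a minimal illustration under stated assumptions, not the logic of Organizational Consultant: the function name `find_misfits` and the dictionary layout are hypothetical, and the sketch covers only the three properties whose current and recommended values are both listed in this report (complexity, formalization, centralization). Certainty factors are ignored.

```python
def find_misfits(current: dict, recommended: dict) -> list:
    """Return the properties whose current and recommended values differ."""
    return [name for name in current
            if name in recommended and current[name] != recommended[name]]

# Values copied from this report: the current organization and the
# recommendation with the greatest certainty factor for each property.
current = {"complexity": "medium",
           "formalization": "low",
           "centralization": "medium"}
recommended = {"complexity": "medium",
               "formalization": "medium",
               "centralization": "medium"}

misfits = find_misfits(current, recommended)
print(misfits)  # prints ['formalization']
```

The configuration misfit would be found the same way once the current configuration is added to both dictionaries.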


MORE  DETAILED  RECOMMENDATIONS 

There  are  a  number  of  more  detailed  recommendations  (cf  100). 

You  may  consider  increasing  the  number  of  positions  for  which  job  descriptions  are  available.

You  may  consider  supervising  the  employees  more  closely.

You  may  consider  allowing  employees  less  latitude  from  standards. 

You  may  consider  more  written  job  descriptions. 

Managerial  employees  may  be  asked  to  follow  written  instructions  and  procedures  more  closely.

You  may  consider  having  more  written  rules  and  procedures.

END 


REPORT  SUMMARY  -  Software  Engineering 

Time:  2:40:37  PM,  12/29/99 
Scenario:  Scenario  2 

INPUT  DATA  SUMMARY 

The  description  below  summarizes  and  interprets  your  answers  to  the  questions  about  your  organization 
and  its  situation.  It  states  your  answers  concerning  the  organization’s  current  configuration,  complexity, 
formalization,  and  centralization.  Your  responses  to  the  various  questions  on  the  contingencies  of  age,  size, 
technology,  environment,  management  style,  cultural  climate  and  strategy  factors  are  also  given.  The 
writeup  below  summarizes  the  input  data  for  the  analysis. 

-  Software  Engineering  has  a  matrix  configuration  (cf  100). 

-  Software  Engineering  has  a  small  number  of  different  jobs  (cf  100). 

-  Of  the  employees  at  Software  Engineering  76  to  100  %  have  an  advanced  degree  or  many  years  of  special 
training  (cf  100). 

-  Software  Engineering  has  3  to  5  vertical  levels  separating  top  management  from  the  bottom  level  of  the
organization  (cf  100).

-  The  mean  number  of  vertical  levels  is  3  to  5  (cf  100). 

-  Software  Engineering  has  1  or  2  separate  geographic  locations  (cf  100). 

-  Software  Engineering’s  average  distance  of  these  separate  units  from  the  organization’s  headquarters  is  of 
no  relevance  because  there  is  only  one  site  undetermined  (cf  100). 

-  An  undetermined  number  of  Software  Engineering’s  total  workforce  is  located  at  these  separate  units  (cf
100).

-  Job  descriptions  are  available  for  operational  employees,  low  and  middle  management  (cf  100). 

-  Where  written  job  descriptions  exist,  the  employees  are  supervised  closely  to  ensure  compliance  with 
standards  set  in  the  job  description  (cf  100). 

-  The  employees  are  allowed  to  deviate  a  moderate  amount  from  the  standards  (cf  100). 

-  81  to  100  %  of  non-managerial  employees  are  given  written  operating  instructions  or  procedures  for  their
job  (cf  100).

-  The  written  instructions  or  procedures  given  are  followed  to  a  great  extent  (cf  100). 

-  Supervisors  and  middle  managers  are  to  a  little  extent  free  from  rules,  procedures,  and  policies  when  they
make  decisions  (cf  100).

-  More  than  80  %  of  all  the  rules  and  procedures  that  exist  within  the  organization  are  in  writing  (cf  100). 

-  Top  Management  is  to  some  extent  involved  in  gathering  the  information  they  will  use  in  making 
decisions  (cf  100). 

-  Top  management  participates  in  the  interpretation  of  41  to  60  %  of  the  information  input  (cf  100).

-  Top  management  directly  controls  0  to  20  %  of  the  decisions  executed  (cf  100). 

-  The  typical  middle  manager  has  some  discretion  over  establishing  his  or  her  budget  (cf  100). 

-  The  typical  middle  manager  has  some  discretion  over  how  his/her  unit  will  be  evaluated  (cf  100). 

-  The  typical  middle  manager  has  great  discretion  over  the  hiring  and  firing  of  personnel  (cf  100). 

-  The  typical  middle  manager  has  some  discretion  over  personnel  rewards  -  (ie,  salary  increases  and
promotions)  (cf  100).

-  The  typical  middle  manager  has  some  discretion  over  purchasing  equipment  and  supplies  (cf  100).

-  The  typical  middle  manager  has  some  discretion  over  establishing  a  new  project  or  program  (cf  100). 

-  The  typical  middle  manager  has  very  great  discretion  over  how  work  exceptions  are  to  be  handled  (cf
100).

-  Software  Engineering  has  25  employees  (cf  100). 

-  Software  Engineering’s  age  is  young  (cf  100). 

-  Software  Engineering's  ownership  status  is  public  (cf  100). 

-  Software  Engineering  has  few  different  products  (cf  100).

-  Software  Engineering  has  few  different  markets  (cf  100). 

-  Software  Engineering  only  operates  in  one  country  (cf  100). 

-  Software  Engineering  has  no  different  products  in  the  foreign  markets  (cf  100). 

-  Software  Engineering’s  major  activity  is  categorized  as  service  (cf  100). 

-  Software  Engineering  has  a  specialized  customer-oriented  service  technology  (cf  100). 

-  Software  Engineering  has  a  medium  routine  technology  (cf  100).

-  Software  Engineering's  technology  is  highly  divisible  (cf  100). 

-  Software  Engineering’s  technology  dominance  is  strong  (cf  100). 

-  Software  Engineering  has  either  planned  or  already  has  an  advanced  information  system  (cf  100). 

-  Software  Engineering’s  environment  is  complex  (cf  100). 

-  The  uncertainty  of  Software  Engineering’s  environment  is  high  (cf  100). 

-  The  equivocality  of  the  organization’s  environment  is  high  (cf  100). 

-  Software  Engineering’s  environment  has  an  undetermined  level  of  hostility  (cf  100). 

-  Top  management  prefers  to  make  policy  and  general  resource  allocation  decisions  (cf  100). 

-  Top  management  primarily  prefers  to  make  long-term  decisions  (cf  100). 

-  Top  management  has  a  preference  for  very  aggregate  information  when  making  decisions  (cf  100). 

-  Top  management  has  a  preference  for  some  proactive  actions  and  some  reactive  actions  (cf  100). 

-  Top  management  is  risk  averse  (cf  100). 

-  Top  management  has  a  preference  for  high  control  (cf  100).

-  Software  Engineering  operates  in  an  industry  with  a  medium  capital  requirement  (cf  100). 

-  Software  Engineering  has  a  high  product  innovation  (cf  100). 

-  Software  Engineering  has  a  high  process  innovation  (cf  100). 

-  Software  Engineering  has  a  high  concern  for  quality  (cf  100). 

-  Software  Engineering's  price  level  is  undetermined  relative  to  its  competitors  (cf  100). 

-  The  level  of  trust  is  high  (cf  100). 

-  The  level  of  conflict  is  low  (cf  100).

-  The  employee  morale  is  high  (cf  100). 

-  Rewards  are  given  in  an  inequitable  fashion  (cf  100).

-  The  resistance  to  change  is  not  known  (cf  100). 

-  The  leader  credibility  is  high  (cf  100). 

-  The  level  of  scapegoating  is  low  (cf  100). 

THE  SIZE 

The  size  of  the  organization  -  large,  medium,  or  small  -  is  based  upon  the  number  of  employees,  adjusted 
for  their  level  of  education  or  technical  skills. 

Based  on  the  answers  you  provided,  it  is  most  likely  that  your  organization’s  size  is  medium  (cf  50). 

More  than  75  %  of  the  people  employed  by  Software  Engineering  have  a  high  level  of  education. 
Adjustments  are  made  to  this  effect.  The  adjusted  number  of  employees  is  lower  than  500  but  greater  than 
100  and  Software  Engineering  is  categorized  as  medium.  However,  for  this  adjusted  number  this  size  does 
not  have  a  major  effect  on  the  organizational  structure. 

THE  CLIMATE 

The  organizational  climate  effect  is  the  summary  measure  of  people  and  behavior. 

Based  on  the  answers  you  provided,  it  is  most  likely  that  the  organizational  climate  is  a  group  climate  (cf 
82). 

It  could  also  be  that  the  climate  is  a  developmental  climate  (cf  80).

The  group  climate  is  characterized  as  a  friendly  place  to  work  where  people  share  a  lot  of  themselves.  It  is
like  an  extended  family.  The  leaders,  or  head  of  the  organization,  are  considered  to  be  mentors  and,  perhaps 
even  parent  figures.  The  organization  is  held  together  by  loyalty  or  tradition.  Commitment  is  high.  The 
organization  emphasizes  the  long-term  benefit  of  human  resource  development  with  high  cohesion  and 
morale  being  important.  Success  is  defined  in  terms  of  sensitivity  to  customers  and  concern  for  people.  The 
organization  places  a  premium  on  teamwork,  participation,  and  consensus. 

When  the  organization  has  a  high  level  of  trust,  it  is  likely  that  the  organization  has  a  group  climate.  An
organization  with  little  conflict  can  be  categorized  as  having  a  group  climate.  High  employee  morale  is
one  element  of  a  group  climate.  High  leader  credibility  characterizes  an  organization  with  a  group  climate.
An  organization  with  a  low  level  of  scapegoating  may  have  a  group  climate.

The  developmental  climate  is  characterized  as  a  dynamic,  entrepreneurial  and  creative  place  to  work. 
People  stick  their  necks  out  and  take  risks.  The  leaders  are  considered  to  be  innovators  and  risk  takers.  The 
glue  that  holds  organizations  together  is  commitment  to  experimentation  and  innovation.  The  emphasis  is 
on  being  on  the  leading  edge.  Readiness  for  change  and  meeting  new  challenges  are  important.  The 
organization’s  long-term  emphasis  is  on  growth  and  acquiring  new  resources.  Success  means  having  unique 
and  new  products  or  services  and  being  a  product  or  service  leader  is  important.  The  organization 
encourages  individual  initiative  and  freedom. 

When  the  organization  has  a  high  to  medium  level  of  trust,  it  is  likely  that  the  organization  has  a
developmental  climate.  An  organization  with  a  low  level  of  conflict  can  be  categorized  as  having  a
developmental  climate.  High  employee  morale  is  frequently  one  element  of  a  developmental
climate.  Medium  to  high  leader  credibility  characterizes  an  organization  with  a  developmental  climate.  An
organization  with  a  medium  level  of  scapegoating  may  have  a  developmental  climate.

THE  MANAGEMENT  STYLE 

The  level  of  management’s  microinvolvement  in  decision  making  is  the  summary  measure  of  management 
style.  Leaders  have  a  low  preference  for  microinvolvement;  managers  have  a  high  preference  for 
microinvolvement. 

Based  on  the  answers  you  provided,  it  is  most  likely  that  your  management  profile  has  a  low  preference  for 
microinvolvement  (cf  72). 

It  could  also  be  that  your  management  profile  has  an  inappropriate  preference  (cf  70). 

It  could  also  be  that  your  management  profile  has  a  high  preference  (cf  69). 

The  management  of  Software  Engineering  has  a  preference  for  delegating  decisions.  This  will  lead  toward 
a  low  preference  for  microinvolvement.  Management  has  a  long-term  horizon  when  making  decisions, 
which  characterizes  a  preference  for  a  low  microinvolvement.  Since  the  management  has  a  preference  for 
making  decisions  on  the  basis  of  very  aggregate  information  a  low  preference  for  microinvolvement 
characterization  is  appropriate. 

The  management  dimensions  are  not  in  balance.  This  is  likely  to  result  in  an  ineffectual  individual. 
Management  is  risk  averse.  This  is  one  of  the  characteristics  of  a  manager  with  a  high  preference  for 
microinvolvement.  Management  has  a  preference  for  using  control  to  coordinate  activities,  which  leads 
toward  a  high  preference  for  microinvolvement. 

THE  STRATEGY 

The  organization's  strategy  is  categorized  as  one  of  either  prospector,  analyzer  with  innovation,  analyzer 
without  innovation,  defender,  or  reactor.  These  categories  follow  Miles  and  Snow’s  typology.  Based  on 
your  answers,  the  organization  has  been  assigned  to  a  strategy  category.  This  is  a  statement  of  the  current 
strategy;  it  is  not  an  analysis  of  what  is  the  best  or  preferred  strategy  for  the  organization. 

Based  on  the  answers  you  provided,  it  is  most  likely  that  your  organization’s  strategy  is  a  prospector 
strategy  (cf  73). 

It  could  also  be:  a  defender  (cf  72). 

It  could  also  be:  an  analyzer  with  innovation  (cf  72). 
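
The way the report presents a most likely conclusion followed by alternatives can be sketched as a ranking by certainty factor. This is an illustrative sketch only: the function name `rank_conclusions` is hypothetical, and how Organizational Consultant derives or orders tied certainty factors is not stated in the report. The sketch simply sorts the three values quoted above and keeps the listed order for ties.

```python
def rank_conclusions(candidates: dict) -> list:
    """Sort (conclusion, cf) pairs from highest to lowest certainty factor.

    Python's sort is stable, so tied entries keep their original order.
    """
    return sorted(candidates.items(), key=lambda item: item[1], reverse=True)

# Certainty factors quoted in the strategy analysis above.
strategy_cf = {"prospector": 73,
               "defender": 72,
               "analyzer with innovation": 72}

ranked = rank_conclusions(strategy_cf)
most_likely, alternatives = ranked[0], ranked[1:]
print(most_likely)  # prints ('prospector', 73)
```

With margins as small as one point between the candidates, the alternatives deserve nearly as much attention as the top-ranked conclusion.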

An  organization  with  a  prospector  strategy  is  an  organization  that  continually  searches  for  market 
opportunities  and  regularly  experiments  with  potential  responses  to  emerging  environmental  trends.  Thus, 
the  organization  is  often  the  creator  of  change  and  uncertainty  to  which  its  competitors  must  respond. 
However,  because  of  its  strong  concern  for  product  and  market  innovation,  a  prospector  usually  is  not
completely  efficient. 

With  a  concern  for  high  quality  a  prospector  strategy  is  a  likely  strategy  for  Software  Engineering.  With  top 
management  preferring  a  relatively  low  level  of  microinvolvement,  the  strategy  is  likely  to  be  prospector. 

An  organization  with  a  defender  strategy  is  an  organization  that  has  a  narrow  product  market  domain.  Top
managers  in  this  type  of  organization  are  expert  in  their  organization’s  limited  area  of  operation  but  do  not 
tend  to  search  outside  their  domains  for  new  opportunities.  As  a  result  of  this  narrow  focus,  these 
organizations  seldom  need  to  make  major  adjustments  in  their  technology,  structure,  or  methods  of 
operation.  Instead,  they  devote  primary  attention  to  improving  the  efficiency  of  their  existing  operations. 

Software  Engineering  has  few  products.  It  needs  to  defend  these  products  well  in  the  marketplace.  Viability
depends  on  being  successful  with  these  limited  activities.  With  a  concern  for  high  quality  a  defender 
strategy  is  a  likely  strategy  for  Software  Engineering. 

An  organization  with  an  analyzer  with  innovation  strategy  is  an  organization  that  combines  the  strategy  of 
the  defender  and  the  prospector.  It  moves  into  the  production  of  a  new  product  or  enters  a  new  market  after 
viability  has  been  shown.  But  in  contrast  to  an  analyzer  without  innovation,  it  has  innovations  that  run 
concurrently  with  the  regular  production.  It  has  a  dual  technology  core. 

An  organization  with  a  medium  capital  investment  is  likely  to  have  some  capabilities  rather  fixed,  but  can 
also  adjust.  The  analyzer  with  innovation  which  seeks  new  opportunities  but  also  maintains  its  profitable 
position  is  appropriate.  With  a  medium  routine  technology,  Software  Engineering  has  some  flexibility.  It  is
consistent  with  an  analyzer  with  innovation  strategy.  With  a  concern  for  high  quality  an  analyzer  with 
innovation  strategy  is  a  likely  strategy  for  Software  Engineering. 

THE  CURRENT  ORGANIZATIONAL  CHARACTERISTICS 

Based  on  your  answers,  the  organization’s  complexity,  formalization,  and  centralization  have  been 
calculated.  This  is  the  current  organization.  Later  in  this  report,  there  will  be  recommendations  for  the 
organization. 

The  current  organizational  complexity  is  medium  (cf  100). 

The  current  horizontal  differentiation  is  medium  (cf  100). 

The  current  vertical  differentiation  is  low  (cf  100). 

The  current  spatial  differentiation  is  low  (cf  100). 

The  current  centralization  is  medium  (cf  100). 

The  current  formalization  is  high  (cf  100). 

The  current  organization  has  been  categorized  with  respect  to  formalization,  centralization,  and  complexity. 
The  categorization  is  based  on  the  input  you  gave  and  does  not  take  missing  information  into  account. 


SITUATION  MISFITS 

A  situation  misfit  is  an  unbalanced  situation  among  the  contingency  factors  of  management  style,  size, 
environment,  technology,  climate,  and  strategy. 

The  following  misfits  are  present:  (cf  100). 

Software Engineering has both a prospector strategy and a risk averse management. This strategy
conflicts with the management's risk averse attitude. A prospector strategy demands a projection into the
unknown with new and innovative products and services, where the returns are uncertain. A risk averse
management will be very uncomfortable with this high level of risk. Risk averse managers prefer
situations with less uncertainty. It is possible to either change the prospector strategy or hire more risk
assuming managers. Usually a risk averse management will control expenditures to reduce or eliminate
the prospector projects. If the environment and markets call for a prospector strategy, a new management
would be preferable. Some risk averse managers can adapt, but it is very difficult.

Software  Engineering  has  both  a  prospector  strategy  and  not  many  products  or  markets.  The  prospector  will 
create  a  broad  range  of  new  possible  products  and  services,  which  requires  a  large  number  of  possible 
products  and  markets.  A  prospector  requires  variety  to  explore  and  find  new  products  and  markets  for  its 
innovations.  With  limited  product  and  market  opportunity,  the  range  of  prospector  possibilities  may  exceed 
the  environmental  possibilities.  The  prospector  needs  to  seek  new  markets  as  well  as  new  products.  If  the 
markets  do  not  exist  or  cannot  be  created,  the  prospector  will  incur  high  costs  of  innovation  without  return. 

Software Engineering has a group climate. This is a mismatch with a prospector strategy! A group climate
has  low  resistance  to  change.  A  prospector  strategy  is  committed  to  changes. 

ORGANIZATIONAL  CONSULTANT  RECOMMENDATIONS 

Based  on  your  answers  about  the  organization,  its  situation,  and  the  conclusions  with  the  greatest  certainty 
factor  from  the  analyses  above  Organizational  Consultant  has  derived  recommendations  for  the 
organization’s  configuration,  complexity,  formalization,  and  centralization.  There  are  also 
recommendations  for  coordination  and  control,  the  appropriate  media  richness  for  communications,  and 
incentives.  More  detailed  recommendations  for  possible  changes  in  the  current  organization  are  also 
provided. 

ORGANIZATIONAL  CONFIGURATIONS 

The  most  likely  configuration  that  best  fits  the  situation  has  been  estimated  to  be  an  adhocracy 
configuration  (cf  68). 

It  is  certainly  not:  a  professional  bureaucracy  (cf  -73). 

It  is  certainly  not:  a  machine  bureaucracy  (cf  -73). 

An  adhocracy  organization  is  normally  an  organization  with  high  horizontal  differentiation,  low  vertical 
differentiation,  low  formalization,  decentralization,  and  great  flexibility  and  responsiveness. 

An  adhocracy  configuration  is  appropriate  when  neither  the  environmental  equivocality  of  Software 
Engineering nor the environmental uncertainty is low. When the organization is also young, the conclusion
that it should be an adhocracy is further strengthened. Since top management has a low preference for
microinvolvement,  the  ad  hoc  configuration  is  feasible.  However,  the  size  of  the  organization  is  not  very 
important  for  the  choice  of  an  adhocracy  configuration.  A  prospector  like  Software  Engineering  should  be 
configured  as  an  ad  hoc  organization.  An  organization  with  a  group  climate  could  have  an  ad  hoc 
configuration. 

Since  the  organization  has  a  prospector  strategy,  it  cannot  have  a  configuration  like  a  professional 
bureaucracy. 

When  the  organization  has  a  prospector  strategy,  it  cannot  be  a  machine  bureaucracy! 

ORGANIZATIONAL  CHARACTERISTICS 

The  recommended  degree  of  organizational  complexity  is  medium  (cf  54). 

Medium  size  organizations  should  have  medium  organizational  complexity.  Software  Engineering  has  a 
technology  that  is  somewhat  routine,  which  implies  that  the  organizational  complexity  should  be  medium. 
Because  Software  Engineering  has  an  advanced  information  system,  organizational  complexity  can  be 
greater  than  it  could  otherwise.  A  group  climate  in  the  organization  requires  a  medium  level  of  complexity 
with  a  low  level  of  vertical  differentiation. 

The  recommended  degree  of  horizontal  differentiation  is  low  (cf  34). 

It,  too,  could  be:  medium  (cf  24). 

The  recommended  degree  of  vertical  differentiation  is  low  (cf  72). 

It,  too,  could  be:  medium  (cf  62). 

The  recommended  degree  of  formalization  is  low  (cf  56). 

Software  Engineering  has  a  prospector  strategy.  A  low  formalization  is  required  so  that  the  organization 
can  react  quickly.  Low  formalization  is  also  required  because  of  the  need  for  innovations.  Since  the  set  of 
variables  in  the  environment  that  will  be  important  is  not  known  and  since  it  is  not  possible  to  predict  what 
will  happen,  no  efficient  rules  and  procedures  can  be  developed,  which  implies  that  Software  Engineering’s 
formalization  should  be  low.  Low  formalization  is  consistent  with  top  management  having  a  low  preference 
for  microinvolvement.  A  group  climate  in  the  organization  requires  a  low  level  of  formalization. 

The  recommended  degree  of  centralization  is  low  (cf  46). 

There  is  evidence  against  it  should  be:  high  (cf  -16). 

Software  Engineering  has  a  prospector  strategy.  A  low  centralization  is  required  so  that  the  organization 
can  react  and  innovate  quickly.  Since  there  are  many  factors  in  the  environment  that  affect  the  organization 
but  Software  Engineering  does  not  know  which  factors  are  or  will  be  important  for  Software  Engineering, 
centralization should be low. Low centralization can be allowed when top management has no desire for
microinvolvement.  A  group  climate  in  the  organization  requires  a  low  level  of  centralization. 

Software  Engineering's  span  of  control  should  be  moderate  (cf  62). 

Since Software Engineering has some technology routineness, it should have a moderate span of control.

Software Engineering should use media with high media richness (cf 85).

The  information  media  that  Software  Engineering  uses  should  provide  a  large  amount  of  information  (cf 
85). 

Incentives  should  be  based  on  results  (cf  85). 

Software  Engineering  should  use  meetings  as  means  for  coordination  and  control  (cf  94). 

When  the  environment  of  Software  Engineering  has  high  equivocality,  high  uncertainty,  and  high 
complexity,  coordination  and  control  should  be  obtained  through  integrators  and  group  meetings.  The 
richness  of  the  media  should  be  high  with  a  large  amount  of  information.  Incentives  must  be  results  based. 
An  open  organizational  climate  and  team  spirit  must  be  fostered.  Information  must  be  shared  among  all 
levels. Constructive conflict on 'what to do' will be usual. Individual tolerance of ambiguity and uncertainty
will  be  necessary.  Individual  performance  evaluation  will  be  problematic  and  largely  subjective.  Mutual 
adjustments  of 'give  and  take'  will  be  the  norm.  Frequent  informal  meetings  and  temporary  task  forces  will 
be  the  primary  coordinating  devices.  When  the  organization  has  a  group  climate,  coordination  should  be 
obtained  using  integrators  and  group  meetings.  Incentives  could  be  results  based  but  with  a  group 
orientation.  An  organization  with  a  group  climate  will  likely  have  to  process  a  large  amount  of  information 
and  will  need  information  media  with  high  richness. 


ORGANIZATIONAL  MISFITS 

Organizational  misfits  compare  the  recommended  organization  with  the  current  organization. 
The  following  organizational  misfits  are  present:  (cf  100). 

Current  and  prescribed  configuration  do  not  match. 

Current  and  prescribed  centralization  do  not  match. 

Current  and  prescribed  formalization  do  not  match. 


MORE  DETAILED  RECOMMENDATIONS 

There  are  a  number  of  more  detailed  recommendations  (cf  100). 

You  may  consider  supervising  the  employees  less  closely. 

You  may  consider  fewer  written  job  descriptions. 

Managerial  employees  may  be  asked  to  pay  less  attention  to  written  instructions  and  procedures. 
You  may  give  supervisors  and  middle  managers  fewer  rules  and  procedures. 

You  may  consider  having  fewer  rules  and  procedures  put  in  writing. 


END 
APPENDIX C


SIMULATION  REPORTS 

The following chart (Fig. C.1) presents the simulated organization and the simulated
software process. The process presents only four cycles of evolution. Each cycle has the
activities described in Chapter VII (Fig. 7.1).


Figure C.1: Project layout


Note: The detailed description of the notation can be found in the ViteProject user
manual (Levitt, 1999). Rectangles indicate tasks. Rounded-corner rectangles indicate
roles. Parallelograms indicate meetings. Double-headed dashed arrows indicate
information dependencies between tasks. Dashed arrows indicate problem dependencies
between tasks. Normal arrows indicate precedence dependencies between tasks.
Simulation Results


Table C.1 shows the expected durations and the standard deviations in days for
the sixteen scenarios simulated. The simulations were configured to run 30 cases for each
scenario. The column titled LGC shows the complexity of each scenario,
measured in LGC.
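
As a minimal sketch of how the two statistics reported per scenario could be obtained from a batch of simulation runs, the following Python fragment computes the mean and the standard deviation of a list of durations. The function name and the three sample durations are hypothetical and purely illustrative (a real scenario would supply 30 values). The use of the sample (n-1) standard deviation is an assumption, since the text does not state which estimator was used.

```python
import statistics


def summarize_durations(durations_in_days):
    """Return (expected duration, sample standard deviation) in days."""
    expected = statistics.mean(durations_in_days)
    # statistics.stdev uses the n-1 denominator (assumed estimator).
    spread = statistics.stdev(durations_in_days)
    return expected, spread


# Hypothetical durations for a handful of simulated cases, not thesis data.
sample = [250.0, 254.0, 258.0]
expected, spread = summarize_durations(sample)
print(expected, spread)
```

With 30 cases per scenario, the same call would be made once per scenario to fill the two duration columns of the table.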


Table C.1: Simulation results

Scenario  Efficiency  Req. Vol.  Complexity  E(t) days  σ(t) days  LGC
LLL       L           L          L           [?]        [?]        [?]
LLH       L           L          H           [?]        [?]        [?]
LLH2.5    L           L          H2.5        254        16         1334
LLH5      L           L          H5          507        31         3230
LHL       L           H          L           [?]        7          746
LHH       L           H          H           128        10         [?]
LHH2.5    L           H          H2.5        319        25         1334
LHH5      L           H          H5          638        49         3230
HLL       H           L          L           [?]        [?]        [?]
HLH       H           L          H           [?]        [?]        [?]
HLH2.5    H           L          H2.5        [?]        [?]        [?]
HLH5      H           L          H5          [?]        14         [?]
HHL       H           H          L           42         [?]        [?]
HHH       H           H          H           49         4          [?]
HHH2.5    H           H          H2.5        [?]        9          1334
HHH5      H           H          H5          [?]        18         [?]


Figure C.2: Effects of complexity

Figure C.2 shows the effects of complexity on the expected duration of scenarios with
similar efficiency and requirements volatility. Observe that the effect of complexity is
different when the efficiency and requirement volatility vary.


Figure C.3: Effects of efficiency

Figure C.3 shows the effects of efficiency. For the same values of complexity and
requirements volatility, the durations for high efficiency scenarios were 40% of the
durations for low efficiency ones.


Figure C.4 shows the effects of requirements volatility. For the same values of complexity
and efficiency, the durations for volatile scenarios were 122% of the durations for
non-volatile ones.
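
The two ratios quoted above can be checked against individual scenario pairs. The sketch below uses only expected durations that are legible in Table C.1; the dictionary layout and the helper name are hypothetical. For these particular pairs the ratios come out near 1.26 (volatile versus non-volatile) and 0.38 (high versus low efficiency), which is in the neighborhood of the 122% and 40% figures; those figures presumably summarize all scenario pairs, not only the ones used here.

```python
# Expected durations E(t) in days for the scenarios legible in Table C.1.
# Keys are (efficiency, requirements volatility, complexity).
expected_days = {
    ("L", "L", "H2.5"): 254,
    ("L", "L", "H5"): 507,
    ("L", "H", "H"): 128,
    ("L", "H", "H2.5"): 319,
    ("L", "H", "H5"): 638,
    ("H", "H", "H"): 49,
}


def duration_ratio(numerator_key, denominator_key):
    """Ratio of two expected durations (hypothetical helper)."""
    return expected_days[numerator_key] / expected_days[denominator_key]


# Volatile vs. non-volatile, same efficiency and complexity.
volatility_ratios = [
    duration_ratio(("L", "H", c), ("L", "L", c)) for c in ("H2.5", "H5")
]

# High vs. low efficiency, same volatility and complexity.
efficiency_ratio = duration_ratio(("H", "H", "H"), ("L", "H", "H"))

print([round(r, 3) for r in volatility_ratios], round(efficiency_ratio, 3))
```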
APPENDIX D


PARAMETER  CONFIGURATION  FOR  VITEPROJECT 


ViteProject uses a set of default values for the variables of the model. These
values are stored in a file named "behmatrx.opd" in the subdirectory of ViteProject. The
behavior of the model depends on the values of these variables, which are collectively
called the Behavior Matrix. This Appendix discusses the concepts considered in the
behavior matrix and their relationship with software projects.

(1)  Participant  attention  rule:  Defines  the  probability  distribution  applied  to  the 
different  selection  methods  (e.g.  priority,  FIFO,  LIFO,  random)  of  picking  items 
to  process. 

(2) Participant tool selection rules: Defines the probability distribution applied to
different information exchange tools (e.g. conversation, email, fax, memo, phone,
video, voice-mail) given the type of message (e.g. Exception, Decision, etc.). A
tool selected for an information exchange determines (1) the time needed for the
message to move from one participant to another and (2) the time the message
will stay in the in-tray of the receiver participant.

Findings: 

i.  Even  if  there  is  one  matrix  for  each  role,  all  the  matrices  are  identical. 

ii.  Too  much  emphasis  on  voicemail.  We  expected  more  weight  on 
conversation,  phone  and  email. 

(3) Activity Verification Failure Probability (VFP) adjustment: There are two VFPs
(internal and external). The internal VFP depends on the complexity of the
requirement and the skills of the participants. The external VFP depends on the
complexity of the solution and the skills of the participants. The processing speed
of responsible participants is affected by the solution complexity and the
requirement complexity.

(4)  Activity  Information  Exchange  Frequency  adjustment:  This  adjustment  depends 
on  the  uncertainty  of  the  activity  and  the  team  experience. 

(5)  Participant  Processing  Speed  adjustment:  This  adjustment  depends  on  the  match 
between  the  participant  and  activity  skill  requirements. 

(6)  Definition  of  Rework,  Quick-Fix,  and  Ignore  decisions:  This  matrix  defines  how 
much  of  the  original  failed  work  should  be  reworked,  quick-fixed  or  ignored.  The 
values  depend  on  the  following  failure  types: 

i. Internal|Internal: Amount of rework of an activity given internal activity
failure (based on VFP Internal).

ii. Internal|External: Amount of rework of an activity given external failure
(based on VFP External).

iii. Internal|External: Amount of rework of a failure dependent activity given
external failure of an independent activity (based on VFP External of the
independent activity).

(7)  Impact  of  participant  information  exchange  behavior  on  its  VFP:  This  adjustment 
depends  on  the  attendance  or  non-attendance  of  the  participant  to  information 
exchange  events  related  to  the  activities. 

(8)  Impact  of  participant  decision-making  behavior  on  the  VFP  of  failed  activity: 
This  adjustment  depends  on  the  centralization  level  of  the  organization. 

(9) Finally, there is a set of matrices to implement Project Decision Making Policies,
including how to determine to whom to report an exception, how to make a decision for
an exception, and the maximum time a participant will wait before it takes delegation
by default.
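
To make the adjustment in item (3) concrete, the following Python sketch applies the VFP formulas stated in the comments of the behavior matrix file: the activity's external VFP is the project's external VFP multiplied by the solution complexity effect and the participant skill effect, and the internal VFP uses the requirement complexity effect in the same way. The effect values are the defaults that appear in the listing reproduced in this appendix (skill-match case only). The function names and the project-level VFP of 0.10 are hypothetical, and this is an illustration of the arithmetic, not ViteProject's actual implementation.

```python
LEVELS = ("High", "Medium", "Low")

# Defaults from the SolutionComplexityEffect and
# RequirementComplexityEffect matrices.
SOLUTION_COMPLEXITY_EFFECT = {"High": 1.5, "Medium": 1.0, "Low": 0.67}
REQUIREMENT_COMPLEXITY_EFFECT = {"High": 1.5, "Medium": 1.0, "Low": 0.67}

# Defaults from ParticipantSkillMatchVFP: rows are the participant's
# application experience, columns the required skill level.
SKILL_MATCH_VFP = {
    "High": (0.5, 0.7, 0.9),
    "Medium": (0.7, 1.0, 1.2),
    "Low": (0.9, 1.2, 1.5),
}


def skill_effect(experience, required_skill):
    """Look up ParticipantSkillEffect for a matching skill (hypothetical helper)."""
    return SKILL_MATCH_VFP[experience][LEVELS.index(required_skill)]


def activity_vfp_external(project_vfp, solution_complexity, experience, required_skill):
    """External VFP = project VFP * solution complexity effect * skill effect."""
    return (project_vfp
            * SOLUTION_COMPLEXITY_EFFECT[solution_complexity]
            * skill_effect(experience, required_skill))


def activity_vfp_internal(project_vfp, requirement_complexity, experience, required_skill):
    """Internal VFP = project VFP * requirement complexity effect * skill effect."""
    return (project_vfp
            * REQUIREMENT_COMPLEXITY_EFFECT[requirement_complexity]
            * skill_effect(experience, required_skill))


# Hypothetical project-level VFP of 0.10 for both calls.
print(activity_vfp_external(0.10, "High", "Medium", "Medium"))
print(activity_vfp_internal(0.10, "Low", "High", "High"))
```

Under these defaults, a highly complex solution handled by a participant of medium experience raises the external VFP by half, while a simple requirement handled by a highly experienced participant cuts the internal VFP to roughly a third of the project value.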

The following source code is the behavior matrix provided as default by ViteProject.

%
% BehMatrx.opd - Vite-Project uses default qualitative-to-quantitative calibration
% parameter values defined in this file. To override any of the default calibration
% values, place a modified version of this file in the directory that holds Vite-Project
% and specify the file name in the Vite-Project simulation control dialog box.
% Vite-Project will load this file automatically.
%
% Each matrix defines an association set: the row selection, when associated with the
% column selection, has the behavior of the corresponding matrix value. For example, for
% the ParticipantAttentionRule, a Project Manager (PM) will select an item from the
% intray by Priority with probability 0.5. Notation:
% PM = Project Manager
% SL = participant subteam leader
% ST = participant
%
% Revisions:
% 10.17.97 Update comments and values

(Application  BehMatrices) 

%  Participant  attention  rule:  -  A  participant  uses  this  attention  rule  to  select  an  item 
%  from  its  in-tray.  By  default,  all  participants  in  Vite-Project  share  this  common 
%  attention  rule. 

%  Example:  a  Project  Manager  (PM)  will  select  an  item  from  the  intray  by  priority  with 
%  probability  0.5,  with  FIFO  with  probability  0.1,  etc. 


(Matrix ParticipantAttentionRule
:Row PM SL ST                      %= Participant role.
:Column Priority FIFO LIFO Random  %= Item Selection strategy.
:Values (0.4 0.3 0.2 0.1)          %= Probability corresponding strategy
        (0.3 0.4 0.2 0.1)          %  will be applied.
        (0.1 0.5 0.3 0.1)
)


%  Participant  tool  selection  rules 

% Information exchange tool selection is based on only Message types (e.g., Exception,
% Decision, etc.). A tool selected for an information exchange determines (1) the time
% needed for the message to move from one participant to another and (2) the time the
% message will stay in the in-tray of the receiver participant.
% Example: Given an exception to process, the PM will never choose the Phone or Video.
% Note that Decisions go directly to the recipient in-tray without use of an information
% exchange tool.

% - 

%  This  rule  only  applies  to  project  managers 
% 

(Matrix ToolSelectionRulesPM
:Row [Message type]: Decision Exception InfoExchange Meeting Noise
:Column [Tool to use]: Conversation Email Fax Memo Phone Video VoiceMail
:Values (0.15 0.20 0.20 0.20 0.0  0.0 0.25)  %= Probability
        (0.20 0.20 0.20 0.20 0.1  0.0 0.10)  %  a specific tool
        (0.25 0.1  0.1  0.15 0.25 0.0 0.15)  %  will be used
        (0.5  0.0  0.0  0.0  0.2  0.0 0.3)
        (0.3  0.1  0.05 0.1  0.35 0.0 0.1)
)

% 
% This rule only applies to participant leaders
% 

(Matrix ToolSelectionRulesSL
:Row Decision Exception InfoExchange Meeting Noise
:Column Conversation Email Fax Memo Phone Video VoiceMail
:Values (0.15 0.20 0.20 0.20 0.0  0.0 0.25)  %= Probability
        (0.20 0.20 0.20 0.20 0.1  0.0 0.10)  %  a specific tool
        (0.25 0.1  0.1  0.15 0.25 0.0 0.15)  %  will be used
        (0.5  0.0  0.0  0.0  0.2  0.0 0.3)
        (0.3  0.1  0.05 0.1  0.35 0.0 0.1)
)

% - - - 

%  This  rule  only  applies  to  sub  teams 
% 

(Matrix ToolSelectionRulesST
:Row Decision Exception InfoExchange Meeting Noise
:Column Conversation Email Fax Memo Phone Video VoiceMail
:Values (0.15 0.20 0.20 0.20 0.0  0.0 0.25)  %= Probability
        (0.20 0.20 0.20 0.20 0.1  0.0 0.10)  %  a specific tool
        (0.25 0.1  0.10 0.15 0.25 0.0 0.15)  %  will be used
        (0.5  0.0  0.0  0.0  0.2  0.0 0.3)
        (0.3  0.1  0.05 0.10 0.35 0.0 0.10)
)

%  Activity  Verification  Failure  Probability  (VFP)  adjustment: 

%  The  formula  used  to  determine  activities’  internal  and  external  VFP: 

% 

% ?activity.VFPexternal =
%     ?proj.VFPexternal * SolutionComplexityEffect * ParticipantSkillEffect;
% ?activity.VFPinternal =
%     ?proj.VFPinternal * RequirementComplexityEffect * ParticipantSkillEffect;
%
% The adjustment coefficients (e.g., SolutionComplexityEffect, ParticipantSkillEffect)
% are determined by values in the following matrices.

% - 

%  Effect  of  Activity  solution  complexity  on  processing  speed  of  responsible 
%  participants. 

% 

(Matrix SolutionComplexityEffect
:Row High Medium Low     %= Level of solution complexity.
:Values 1.5 1.0 0.67     %= Value of SolutionComplexityEffect
)


% - 

%  Effect  of  Activity  requirement  complexity  on  responsible  participant  processing  speed. 
% 

(Matrix RequirementComplexityEffect
:Row High Medium Low     %= Level of requirement complexity.
:Values 1.5 1.0 0.67     %= Value of RequirementComplexityEffect
)

% - 

%  Effect  of  Participant-Activity  skill  match  on  activity  VFP: 

% If responsible participant skill matches the skill requirement of the
% corresponding activity, then use this matrix to determine
% ParticipantSkillEffect.
%
(Matrix ParticipantSkillMatchVFP
:Row High Medium Low      %= Level of participant App. Experience
:Column High Medium Low   %= Participant Required Skill Level.
:Values (0.5 0.7 0.9)     %= Values of ParticipantSkillEffect.
        (0.7 1.0 1.2)
        (0.9 1.2 1.5)
)

% - 

%  Effect  of  Participant-Activity  match  on  activity  VFP: 
% If participant skill DOES NOT match activity's skill requirement, then
% use this matrix to determine ParticipantSkillEffect. Failure of
% participant-activity skill match creates a major VFP penalty.
%
(Matrix ParticipantSkillNonMatchVFP
:Row High Medium Low      %= Level of participant App. Experience
:Column High Medium Low   %= Participant other Skill Level.
:Values (2.0 2.0 2.0)     %= Values of ParticipantSkillEffect.
        (2.5 2.5 2.5)
        (3.5 3.5 3.5)
)


%  Activity  Information  Exchange  Frequency  adjustment:  The  following  formula  is  used  to 
%  determine  probabilistic  information  exchange  frequency  of  an  activity 
% 

% ?activity.InfoExchangeFrequency = ?proj.InfoExchangeFrequency *
%     ActivityUncertaintyEffect * TeamExperienceEffect


%  Effect  of  Activity  uncertainty  on  information  exchange  frequency: 

% 

(Matrix ActivityUncertaintyEffect
:Row High Medium Low     %= Level of activity uncertainty
:Values 1.4 1.00 0.67    %= Value of ActivityUncertaintyEffect
)


% - 

%  Effect  of  responsible  Participant  team  experience  on  information  exchange  frequency: 
% 

(Matrix TeamExperienceEffect
:Row High Medium Low     %= Level of participant team experience.
:Values 0.67 1.0 1.5     %= Value of TeamExperienceEffect
)

%  participant  processing  speed  adjustment: 

%  The  following  formula  determines  participant  processing  speed.  Since  participant 

%  processing  speed  is  based  on  its  match  with  the  skill  requirement  of  its  assigned 

%  activity,  the  ParticipantSpeed  is  associated  with  each  activity.  (Vite-Project 

%  assumes  that  each  activity  can  have  only  ONE  responsible  participant  working  on 

%  it.) 

% 

% ?activity.ResponsibleParticipantSpeed =
%     1.0 / (?Participant.NumberOfParticipants * ?Participant.APSO *
%     ParticipantSkillEffect * ?Participant.TimePercentageForProject);

% 

%  The  rule  uses  1/  "time  needed  to  process  a  work  unit"  to  calculate  speed. 

% - 

%  Effect  of  Participant-Activity  match  on  APS: 

%  If  responsible  participant  skill  matches  the  activity's  skill 
%  requirement,  then  use  this  matrix  to  determine  the  value  of 
%  ParticipantSkillEffect. 

% 

(Matrix ParticipantSkillMatchAPS
:Row High Medium Low      %= Level of participant App. Experience.
:Column High Medium Low   %= Participant Required Skill level.
:Values (2.0 1.5 0.9)     %= Values of ParticipantSkillEffect
        (1.5 1.0 0.7)
        (0.9 0.7 0.5)
)

% - 

% If participant skill DOES NOT match activity's skill requirement, then
% use this matrix to determine the value of ParticipantSkillEffect.
%
(Matrix ParticipantSkillNonMatchAPS
:Row High Medium Low      %= Level of participant App. Experience.
:Column High Medium Low   %= Participant Other Skill level.
:Values (0.7 0.7 0.7)     %= Values of ParticipantSkillEffect
        (0.5 0.5 0.5)
        (0.3 0.3 0.3)
)


%  Definition  of  Rework,  Quick-Fix,  and  Ignore  decisions: 

% This matrix defines how much of the original failed work should be reworked based
% on decision types (i.e., Reworked, Quick-Fixed, Ignore). The actual rework volume is
% the given subactivity volume * % of failed work that needs to be reworked *
% user-interface defined "Strength" of failure dependent activity relationship
%
% The values change depending on the failure types described below:
%
% Internal|Internal: Amount of rework of an activity given internal activity failure
% (based on VFP Internal.)
%
% Internal|External: Amount of rework of an activity given external failure (based on
% VFP External.)
%
% Internal|External: Amount of rework of a failure dependent activity given external
% failure of an independent activity (based on VFP External of the independent
% activity.)

% 


% 


(Matrix ReworkVolume
:Row Internal Internal|External External|External  %= failure type
:Column Rework Quick-Fix Ignore                    %= Decision for the exception
:Values (1.0 0.5 0.0)                              %= Percent of failed work
        (1.0 0.5 0.0)                              %  that needs to be reworked.
        (1.0 0.5 0.0)
)


%  Impact  of  participant  information  exchange  behavior  on  its  VFP: 

% Vite-Project simulates the impact of participant information exchange behavior on its
% VFP by updating VFP based on the effect weight as shown below (same for VFPexternal
% and VFPinternal):
%
% ?activity.VFPinternal = ?activity.VFPinternal * VFPInfoXEffect;
% if ?activity.VFPinternal > 1.0;
% then ?activity.VFPinternal = 1.0;
%
% The value of VFPInfoXEffect is retrieved from the following matrices.

% 

%  VFP  updating  is  dynamic,  i.e.,  it  happens  whenever  an  information  exchange  finishes. 

%  You  can  disable  the  effects  by  setting  matrix  values  to  1.0. 


% - 

% This matrix defines the weight for updating participant verification failure
% probabilities (*** Internal and External) due to not attending to information exchange
% with peers, meetings and noise respectively.

%  NOTE:  Weight  =1.0  implies  no  effect  of  ignoring  communications,  meetings  or  noise. 

% 


(Matrix ParticipantNonAttendanceFailureEffect
:Row InfoXNonAttend MeetNonAttend NoiseNonAttend  %= Nonatt InfoX type
:Column High Medium Low                           %= Level of formalization
:Values (1.01 1.07 1.1)                           %= VFPInfoXEffect.
        (1.10 1.07 1.05)
        (1.0  1.00 1.00)
)


% -

%  This  matrix  defines  the  weight  for  updating  participant  verification  failure 
%  probability  due  to  attending  to  information  exchange  from  peers,  meetings 
%  and  noise  respectively. 

% 




(Matrix ParticipantAttendanceFailureEffect
:Row InfoXAttend MeetAttend NoiseAttend  %= Nonatt InfoX type
:Column High Medium Low                  %= Level of formalization
:Values (0.99 0.96 0.95)                 %= VFPInfoXEffect.
        (0.90 0.96 0.99)
        (1.0  1.0  1.0)
)


%  Impact  of  participant  decision-making  behavior  on  the  VFP  of  failed  activity: 

% Vite-Project simulates the impact of participant information exchange behavior on its
% VFP by updating VFP based on the effect weight as shown below
% (same for VFPexternal):
%
% ?activity.VFPinternal = ?activity.VFPinternal * VFPInfoXEffect;
% if ?activity.VFPinternal > 1.0;
% then ?activity.VFPinternal = 1.0;
%
% The value of VFPInfoXEffect is retrieved from the following matrices, based on the
% decision-maker's role and the type of decision it has made.

% 

%  VFP  updating  is  dynamic,  i.e.,  it  happens  whenever  a  decision  is  made. 

% 

%  You  can  turn  off  the  effects  by  setting  values  of  the  matrices  to  1.0. 


% - 

%  Matrix  used  for  Low  centralization: 


(Matrix LowCentralDecisionWeight
:Row PM SL ST                    %= Decision-maker's role.
:Column Rework Quick-Fix Ignore  %= Type of decision made.
:Values (0.95 1.0 1.05)          %= VFPInfoXEffect for update VFP
        (0.95 1.0 1.05)
        (0.95 1.0 1.05)
)


% - - - 

%  Matrix  used  for  Medium  centralization: 

% 

(Matrix MediumCentralDecisionWeight
:Row PM SL ST                    %= Decision-maker's role.
:Column Rework Quick-Fix Ignore  %= Type of decision made.
:Values (0.9  0.95 1.05)         %= VFPInfoXEffect for update VFP
        (0.95 1.0  1.05)
        (0.95 1.05 1.1)
)


% - 

%  Matrix  used  for  High  centralization: 
% 


(Matrix HighCentralDecisionWeight
:Row PM SL ST                    %= Decision-maker's role.
:Column Rework Quick-Fix Ignore  %= Type of decision made.
:Values (0.9  0.95 1.05)         %= VFPInfoXEffect for update VFP
        (0.95 1.0  1.1)
        (0.95 1.1  1.2)
)


%  Following  matrices  are  used  to  implement  Project  Decision  Making  Policies 
%  including  how  to  determine  to  whom  to  report  an  exception,  how  to  make 
%  a  decision  for  an  exception,  what  is  the  maximum  time  a  participant  will 
%  wait  before  it  takes  delegation  by  default. 

% 


% - 

%  Time  To  Wait  For  Decision  Policy: 

%  This  matrix  defines  how  long  a  participant  should  wait  for  a  decision 
%  before  it  assumes  delegation  by  default.  Participants  playing  different 
% roles in the organization may have different time-out durations.
% 

(Matrix TimeToWaitForDecision
:Row PM SL ST    %= Participant roles
:Values 480      %= Time-out duration in minutes
        960
        960
)


% - 

%  Decision  Maker  Policy: 

% This matrix is used by a participant to determine who should make the
% decision for his/her exception, based on the project's centralization
% policy. The assumption is that more centralized project teams
% require higher-level participants to make decisions for exceptions.
% 


(Matrix DecisionMakerPolicy
:Row PM SL ST %= Decision maker's role
:Column High Medium Low %= Level of centralization
:Values (0.6 0.2 0.1) %= Probability
(0.3 0.6 0.3) % a certain role should
(0.1 0.2 0.6) % make the decision.
)


% - 

%  Decision  Choice  Policy: 

% This matrix is used by a decision-maker to determine how an exception should be
% dealt with, based on the project's centralization policy. NOTE: The assumption is that
% higher-level participants (e.g., project managers) tend to make more Rework decisions.
% Vité experience has found this assumption reasonable for routine engineering design.
% However, for domains like software engineering, Vité staff has found that the reverse is
% true. Participants (hackers) want to fix every known bug, whereas managers want to ship
% on time, even with known, non-serious bugs. This matrix should be adjusted to reflect
% the "bug fixing" culture of the organization being modeled.

% 


(Matrix DecisionChoicePolicy
:Row PM SL ST %= Decision-maker's role
:Column Rework Quick-Fix Ignore %= Decision type
:Values (0.65 0.3 0.05) %= Probability
(0.4 0.4 0.2) % the decision-maker will
(0.05 0.35 0.6) % make a certain type of decision
)


%  Information  exchange  Probability  adjustment: 

% The following matrices adjust the frequency probability of different types of
% information exchange based on the level of project formalization:

% 

%  ?AdjustedInfoXProbability  =  OriginalCommunicationProbability  *  AdjustFactor; 

% 

% The Info Exchange AdjustFactor is retrieved from the following matrix given the level
% of formalization.
%
% NOTE: Meeting frequency is not adjustable in Vité-Project, so the Meet row of the
% matrices is not meaningful.

% 


% - 

% This matrix defines the VFP adjustment factor for different types of information
% exchange.

% 


(Matrix CoordinationDistribution
:Row InfoX Meet Noise %= Information exchange type
:Column High Medium Low %= Level of formalization
:Values (0.5 1.0 2.0) %= Info Exchange AdjustFactor.
(0.7 1.0 1.0)
(1.0 1.0 1.0)
)


% ==========================================================================
% In Vité-Project, when a participant picks up an information exchange item, it has to
%  decide  whether  to  attend  the  request  for  information  exchange.  This  matrix  defines  the 
%  chance  a  participant  attends  to  a  given  type  of  information  exchange  given  a  level  of 
%  strength  of  organization  matrix. 

% 

%  e.g.,  if  Matrix  Strength  is  High  (as  in  a  Project  organization),  then  a  participant 
%  will  probabilistically  attend  to  80%  of  information  exchanges,  and  20%  of  the  meetings 
%  and  20%  of  the  Noise.  Project  organizations  have  high  Matrix  strength;  functional  teams 
%  have  low  matrix  strength. 

% 


(Matrix CoordinationPriority
:Row InfoX Meet Noise %= Type of information exchange
:Column High Medium Low %= Org Matrix Strength
:Values (0.9 0.7 0.6) %= Probability
(0.6 0.7 0.9) % a participant will attend
(0.2 0.2 0.2) % a communication.
)

%
% Communications-related matrices
%


% - 

%  This  matrix  defines  the  length  of  time  (in  minutes)  it  takes  to 
%  deliver  messages  using  different  communication  tools 
% 


(Matrix ToolTimeToDeliver
:Row Conversation Email Fax Memo Phone VideoConf VoiceMail %= Communication tool
:Values 10
1
1
5
1
1
1
)


% - 

%  This  matrix  defines  the  length  of  time  (in  minutes)  it  takes  for 
% messages to expire in the recipient's in-tray
% 

(Matrix ToolTimeToExpire
:Row Conversation Email Fax Memo Phone VideoConf VoiceMail %= Communication tool
:Values 60
2400
1440
2400
5
10
960
)


% - - - - 

%  This  matrix  defines  the  volume  (in  minutes)  for  each  type  of  message 
% 

(Matrix MessageVolume
:Row PM SL ST %= Recipient's role
:Column decision exception info_exchange meeting noise %= Message type
:Values (10 120 30 0 10)
(10 240 30 0 10)
(10 240 30 0 10)
)
%%%%%%%%%%%%%%%
%  END  OF  FILE  % 
%%%%%%%%%%%%%%% 
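
As an illustration of how the probability matrices in this file might be read, the following Python sketch draws a decision type from the DecisionChoicePolicy row of a given role and applies the adjustment formula quoted in the comments (AdjustedInfoXProbability = OriginalCommunicationProbability * AdjustFactor). The function names, the dictionary layout, and the original probability of 0.3 are assumptions made for this example only; they are not part of Vité-Project, whose internal implementation is not described in this appendix.

```python
import random

# Values copied from the DecisionChoicePolicy matrix:
# rows are decision-maker roles, columns are decision types.
DECISION_TYPES = ("Rework", "Quick-Fix", "Ignore")
DECISION_CHOICE_POLICY = {
    "PM": (0.65, 0.3, 0.05),
    "SL": (0.4, 0.4, 0.2),
    "ST": (0.05, 0.35, 0.6),
}

# Values copied from the CoordinationDistribution matrix:
# rows are information exchange types, columns are formalization levels.
FORMALIZATION_LEVELS = ("High", "Medium", "Low")
COORDINATION_DISTRIBUTION = {
    "InfoX": (0.5, 1.0, 2.0),
    "Meet": (0.7, 1.0, 1.0),
    "Noise": (1.0, 1.0, 1.0),
}


def sample_decision(role, rng):
    """Draw one decision type, using the role's matrix row as probabilities."""
    weights = DECISION_CHOICE_POLICY[role]
    return rng.choices(DECISION_TYPES, weights=weights, k=1)[0]


def adjusted_infox_probability(original, exchange_type, formalization):
    """Multiply the original probability by the factor looked up in the matrix."""
    column = FORMALIZATION_LEVELS.index(formalization)
    return original * COORDINATION_DISTRIBUTION[exchange_type][column]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same draws
    draws = [sample_decision("PM", rng) for _ in range(1000)]
    print({name: draws.count(name) for name in DECISION_TYPES})
    # 0.3 is a hypothetical original probability, not a value from the file.
    print(adjusted_infox_probability(0.3, "InfoX", "High"))
```

The sketch does not cap the adjusted probability at 1.0; whether the simulator does so is not stated in the file.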


APPENDIX  E 


STATISTICAL  ANALYSIS  OF  SIMULATION  OUTPUTS 

A.  Descriptive  statistics  and  box  plots 


                    LLL       LLH       LHL       LHH       HLL       HLH       HHL       HHH
Mean                88        507.3333  100.8667  638       32.23333  209       41.56667  244
Standard Error      0.91977   5.732211  1.323296  8.939773  0.334538  2.530276  0.495226  3.342516
Median              88        507.5     100.5     635       32        205       41.5      240
Mode                88        535       96        575       31        200       39        230
Standard Deviation  5.037788  31.39661  7.247988  48.96515  1.83234   13.85889  2.712466  18.30771
Kurtosis            0.103871  -1.16404  -0.23193  -0.17195  -0.95402  -1.13356  -0.91119  -0.56513
Skewness            0.130034  -0.00625  0.562342  0.203825  0.062831  0.138658  0.216263  0.600225
Range               22        105       28        210       6         50        10        70
Minimum             78        455       91        540       29        185       37        215
Maximum             100       560       119       750       35        235       47        285
Count               30        30        30        30        30        30        30        30
CI (95.0%)          1.881142  11.72369  2.706445  18.2839   0.684208  5.174999  1.012852  6.836217


The descriptive statistics do not give conclusive information about the kind of
distribution observed. The boxplots show that complexity (the third variable) has the
strongest influence on the development time. Efficiency seems to have less impact, and
requirements volatility seems to have a moderate influence.
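
The standard error and 95% confidence rows of the table can be related to the standard deviation and the count. The following Python sketch recomputes both for scenario LLL. It assumes that the confidence row is the half-width of a two-sided Student t interval with 29 degrees of freedom; the critical value 2.045230 is hard-coded from standard t tables, and the function names are chosen only for this example.

```python
import math

# Standard deviation and count for scenario LLL, taken from the table.
STD_DEV_LLL = 5.037788
COUNT = 30

# Two-sided 95% Student t critical value for 29 degrees of freedom.
# Assumption: the confidence row of the table is based on this value.
T_CRIT_95_DF29 = 2.045230


def standard_error(std_dev, n):
    """Standard error of the mean: standard deviation over sqrt(n)."""
    return std_dev / math.sqrt(n)


def ci_half_width(std_dev, n, t_crit):
    """Half-width of the confidence interval around the mean."""
    return t_crit * standard_error(std_dev, n)


if __name__ == "__main__":
    print(round(standard_error(STD_DEV_LLL, COUNT), 5))
    print(round(ci_half_width(STD_DEV_LLL, COUNT, T_CRIT_95_DF29), 5))
```

Under these assumptions the results agree with the tabulated 0.91977 and 1.881142 to within rounding.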
B. Weibull probability plots

The data obtained from the simulations were analyzed with a statistical software
package from Reliasoft. The product determines which distribution function best
fits the sample. The distributions compared were exponential (one and two parameters),
Weibull (two and three parameters), normal, and lognormal. In all cases the tool
found that the three-parameter Weibull was the best fit. The following plot uses Weibull
probability paper and shows the data points as icons and the distribution function
recommended by the tool as lines.
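
For reference, the three-parameter Weibull distribution can be written down directly. The Python sketch below gives its density, cumulative distribution, and mean. The names beta (shape), eta (scale), and gamma (location) follow common three-parameter Weibull notation and are an assumption here, as is the reading of the values 2.65, 4.92, and 27.87, which are one of the legend triples that survive from the plots in this appendix.

```python
import math


def weibull3_pdf(t, beta, eta, gamma):
    """Density of the three-parameter Weibull; returns 0 for t <= gamma."""
    if t <= gamma:
        return 0.0
    z = (t - gamma) / eta
    return (beta / eta) * z ** (beta - 1.0) * math.exp(-(z ** beta))


def weibull3_cdf(t, beta, eta, gamma):
    """Probability that the duration is at most t."""
    if t <= gamma:
        return 0.0
    return 1.0 - math.exp(-(((t - gamma) / eta) ** beta))


def weibull3_mean(beta, eta, gamma):
    """Mean: location plus scale times Gamma(1 + 1/shape)."""
    return gamma + eta * math.gamma(1.0 + 1.0 / beta)


if __name__ == "__main__":
    # Assumed reading of one legend triple: shape, scale, location.
    beta, eta, gamma = 2.65, 4.92, 27.87
    print(weibull3_mean(beta, eta, gamma))
    print(weibull3_cdf(35.0, beta, eta, gamma))
```

With these values the mean comes out close to 32.2 days, which is in line with the mean reported for scenario HLL in the descriptive statistics.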
C. Probability distribution functions

[Figure: probability density functions.]
D. Effect of requirements volatility

The following graphs show the influence of requirements volatility. Two graphs
are presented to separate the cases of high and low efficiency in order to avoid
confounding factors.
[Figure: probability density functions for the high-efficiency and low-efficiency cases.
Legible legend values: β4 = 2.65, η4 = 4.92, γ4 = 27.87; β1 = 2.80, η1 = 139.?9,
γ1 = 513.88; β2 = 1.83, η2 = 14.07, γ2 = 88.?9.]


In both cases, an increase in volatility shifts the distribution to the right. This shift
is magnified when the complexity is high. The effect is also magnified when the efficiency
is low.
E. Effect of efficiency

The following graphs show the influence of efficiency. Two graphs are presented
to separate the cases of high and low complexity in order to avoid confounding
factors.


[Figure: probability density functions for the high-complexity and low-complexity cases.]


In both cases, an increase in efficiency shifts the distribution to the left. This shift
is magnified when the complexity is high. The effect is also magnified when the volatility
is high.
F. Effect of complexity

The following graphs show the influence of complexity. Two graphs are
presented to separate the cases of high and low requirements volatility in order to
avoid confounding factors.
In both cases, an increase in complexity shifts the distribution to the right. This shift
is magnified when the efficiency is low. The effect is also magnified when the volatility
is high.
G. Cumulative distribution functions and stochastic dominance


Figure x.4: Comparison of the cdfs for the different scenarios.


As expected, for the same level of complexity, high-efficiency scenarios have
stochastic dominance.
H. Contour of time

One of the difficulties in visualizing the model is that it has four variables
(efficiency, requirements volatility, complexity, and time); hence a five-dimensional
space is necessary to represent it (four dimensions for the parameters plus one extra
dimension for the scalar value of the associated probability).

The following graph represents the lines of equal expected time given a discrete
set of scenarios with different efficiency, complexity, and requirements volatility. The
graph is only useful to visualize the combined effect of the three parameters of the model.
Given a certain scenario and a confidence probability it is possible to determine the
expected time in days. For instance, the comparison of HHH5 (high efficiency, high
volatility, high complexity) versus LHH5 (low efficiency and the same other parameters)
shows the effect of efficiency.
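
Reading a time off the graph for a given scenario and confidence probability corresponds to inverting the cumulative distribution. As a minimal sketch, assuming the scenario follows a three-parameter Weibull with shape beta, scale eta, and location gamma (names chosen here), the time for a confidence level p is gamma + eta * (-ln(1 - p))^(1/beta). The parameter values below are one legend triple that survives from the plots in this appendix; treating it as a single scenario is an assumption.

```python
import math


def weibull3_quantile(p, beta, eta, gamma):
    """Time t such that P(T <= t) = p for a three-parameter Weibull."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return gamma + eta * (-math.log(1.0 - p)) ** (1.0 / beta)


def weibull3_cdf(t, beta, eta, gamma):
    """Cumulative distribution, used here only to check the inversion."""
    if t <= gamma:
        return 0.0
    return 1.0 - math.exp(-(((t - gamma) / eta) ** beta))


if __name__ == "__main__":
    # Assumed reading of one legend triple: shape, scale, location.
    beta, eta, gamma = 2.65, 4.92, 27.87
    for p in (0.5, 0.8, 0.95):
        t = weibull3_quantile(p, beta, eta, gamma)
        print(p, round(t, 2), round(weibull3_cdf(t, beta, eta, gamma), 4))
```

Raising the confidence level moves the answer to the right along the same scenario line, which is the trade-off the contour graph is meant to show.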
I. Contour of probabilities


The following graph represents the lines of equal probability of finishing the
project at a given date, given a discrete set of scenarios with different efficiency,
complexity, and requirements volatility. The graph is only useful to visualize the
combined effect of the three parameters of the model.


[Figure: contour of probabilities; horizontal axis in days, from 100 to 800.]
J. Surface of cumulative distribution

The following graph represents a 3D view of the cumulative distributions for a
discrete set of scenarios. The z-axis represents the cdf, the x-axis represents the scenario
(1-16), and the y-axis represents the time (0-100).
K. Snapshots of the surface of cumulative distribution for high efficiency

The following series of graphs represents the continuous 3D aspect of the
five-dimension model given a high efficiency scenario for five different moments in time.
The axes represent complexity, volatility, and cdf. The five snapshots represent time in a
discrete way. Efficiency is constant and high for all the graphs.
L. Snapshots of the surface of cumulative distribution for low efficiency


The following series of graphs represents the continuous 3D aspect of the
five-dimension model given a low efficiency scenario for five different moments in time.
The axes represent complexity, volatility, and cdf. The five snapshots represent time in a
discrete way. Efficiency is constant and low for all the graphs.


M. Surface of cumulative distribution for high efficiency and gamma-beta ratio = 5.5

The following graph represents the cdf surface for a given level of efficiency and
a given level of volatility. The three axes correspond to complexity, time, and cdf. This
graph predicts the future of the project under the hypothesis of constant volatility and
high efficiency.


Efficiency = High (alpha = 1.95, beta-gamma ratio = 5.5)
N. Surface of cumulative distribution for low efficiency and gamma-beta ratio = 5.5

The following graph represents the cdf surface for a given level of efficiency and
a given level of volatility. The three axes correspond to complexity, time, and cdf. This
graph predicts the future of the project under the hypothesis of constant volatility and low
efficiency.


Efficiency  =  Low  (alpha  =  2.5,  beta-gamma  ratio  =  5.5) 


Time  (25  -  800  days) 
APPENDIX F


STOCHASTIC  DOMINANCE 


The  major  areas  of  application  of  dominance  have  been  finance,  insurance,  and 
economics.  The  classical  portfolio  problem  was  the  catalyst  for  the  initial  research.  From 
there  the  technique  was  applied  to  other  domains  (Whitmore  &  Findley,  1978).  Stochastic 
dominance  is  a  methodology  related  to  decision  theory.  It  is  based  on  formal  concepts 
and  theorems  and  employs  partial  information  on  the  decision-maker’s  preferences  and 
the  random  variables  to  produce  a  partial  ordering  (Levy,  1998). 

Definition of dominance: Let D be a domain constituted by a set of decisions. Let $d_i \in D$.
We say that the decision $d_i$ dominates the domain D (expressed as $d_i$ DOM D) if and
only if the return value for the application of $d_i$ is maximum for all possible values of x
and for all possible $d_j \in D$.

$(\forall d_j \in D)(\forall x \in X)\,(R(x, d_i) \geq R(x, d_j)) \Leftrightarrow d_i \text{ DOM } D$

where  D  =  set  of  alternatives  or  decisions,  also  called  Feasible  Set  (F.S.) 

X  =  set  of  possible  values  for  the  random  value  x. 

R(x,  d)  =  a  function  that  measures  the  outcome  of  the  decision. 

Definition of Efficient Set (E.S.): E.S. is the set of dominating decisions.

$(\forall d \in D)\,(d \text{ DOM } D) \Rightarrow d \in \text{E.S.}$

Definition of Inefficient Set (I.S.): I.S. is the set of dominated decisions.

$\text{F.S.} = \text{E.S.} \cup \text{I.S.}$
Definition of First Degree Stochastic Dominance (FSD): FSD is the dominance that
can be established by the application of the following theorem:

Let F(x) and G(x) be cumulative distribution functions (cdf) related to the decisions f and
g respectively. We say that f dominates g by FSD (f FSD g) if and only if $F(x) \geq G(x)$ for
all values of x.

$(\forall x)\,(F(x) \geq G(x)) \Leftrightarrow (f \text{ FSD } g)$

Observations: 

(1) FSD requires that the distributions do not intersect, although they can be tangent.

(2) When more than two alternatives exist, being dominated by one alternative is a
sufficient condition to belong to I.S.

(3) All alternatives in E.S. must intersect, and should not be dominated.

Figure G.1 shows an example of the absence of FSD. cdf1 and cdf2 belong to E.S.;
cdf3 is clearly dominated, so it belongs to I.S. Note that neither cdf1 nor cdf2
dominates the other.


Figure G.1: Concept of domination. cdf1 dominates cdf3.
Definition of sufficient conditions for FSD: Let f, g be two alternatives related to F(x)
and G(x) respectively.

(1) We say that f dominates g by first-degree stochastic dominance (f FSD g) if the
maximum of the range of F(x) is less than or equal to the minimum of the range of G(x)
(Fig. G.2).

$\max(\text{Range}(F(x))) \leq \min(\text{Range}(G(x))) \Rightarrow (f \text{ FSD } g)$


Figure G.2: FSD sufficient condition. Series1 dominates Series2.


(2) We say that f dominates g by first-degree stochastic dominance (f FSD g) if, for all
values of x, F(x) is greater than or equal to G(x), with strict inequality for at least one
value (Fig. G.3).

$(\forall x \in X)(F(x) \geq G(x)) \land (\exists y \in X)(F(y) > G(y))$


Figure G.3: FSD, second sufficient condition. Series1 dominates Series2.
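
The second sufficient condition can be checked mechanically when both cdfs are evaluated on the same grid of x values. The following Python sketch does so under the convention used in this appendix (f dominates g when F is never below G and is strictly above it somewhere). The function name and the sample cdf values are hypothetical and serve only to illustrate the test.

```python
def fsd_dominates(cdf_f, cdf_g):
    """Return True if f FSD g on a common grid of x values.

    Convention of this appendix: F(x) >= G(x) at every grid point
    and F(y) > G(y) for at least one grid point.
    """
    if len(cdf_f) != len(cdf_g):
        raise ValueError("both cdfs must be sampled on the same grid")
    pairs = list(zip(cdf_f, cdf_g))
    never_below = all(f >= g for f, g in pairs)
    strictly_above_somewhere = any(f > g for f, g in pairs)
    return never_below and strictly_above_somewhere


if __name__ == "__main__":
    # Hypothetical cdf values sampled at the same five x positions.
    series1 = [0.0, 0.4, 0.7, 0.9, 1.0]
    series2 = [0.0, 0.2, 0.5, 0.8, 1.0]
    crossing = [0.0, 0.1, 0.6, 0.95, 1.0]
    print(fsd_dominates(series1, series2))   # series1 never below series2
    print(fsd_dominates(series1, crossing))  # curves cross, no FSD
```

A check on a finite grid can miss a crossing between grid points, so the grid should be fine enough for the curves being compared.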
Definition of Second Degree Stochastic Dominance (SSD): SSD is the dominance that
can be established by the application of the following theorem:

Let f, g be two alternatives with cdf F(x) and G(x) respectively. We say that f dominates
g by second-degree stochastic dominance (f SSD g) if and only if the area between the two
curves is positive.

$(f \text{ SSD } g) \Leftrightarrow \int [F(x) - G(x)]\,dx > 0$

Observations:

(1) Figure G.3 represents SSD.

(2) Figure G.1 also represents SSD of cdf2 over cdf1 if we assume that the area under
cdf2 is greater than the area under cdf1.

Definition of sufficient conditions for SSD: FSD is sufficient for SSD.

$(f \text{ FSD } g) \Rightarrow (f \text{ SSD } g)$
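
The area condition can be approximated numerically when both cdfs are sampled on a common grid. The following Python sketch uses the trapezoidal rule and follows the definition as stated in this appendix (a positive total area between the curves). The function names and the sample values are hypothetical; the second pair of curves cross, so FSD does not hold between them, yet the area test still yields an ordering.

```python
def area_between(xs, cdf_f, cdf_g):
    """Trapezoidal estimate of the integral of F(x) - G(x) over the grid xs."""
    total = 0.0
    for i in range(1, len(xs)):
        left = cdf_f[i - 1] - cdf_g[i - 1]
        right = cdf_f[i] - cdf_g[i]
        total += 0.5 * (left + right) * (xs[i] - xs[i - 1])
    return total


def ssd_dominates(xs, cdf_f, cdf_g):
    """f SSD g, in the sense of this appendix, when the area is positive."""
    return area_between(xs, cdf_f, cdf_g) > 0.0


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    # Hypothetical cdf values; cdf_a and cdf_b cross between x = 2 and x = 3.
    cdf_a = [0.0, 0.5, 0.7, 0.8, 1.0]
    cdf_b = [0.0, 0.2, 0.6, 0.9, 1.0]
    print(area_between(xs, cdf_a, cdf_b))
    print(ssd_dominates(xs, cdf_a, cdf_b))
```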

Definition of Third Degree Stochastic Dominance (TSD): The third degree of
stochastic dominance is the preference for positive skewness in the pdfs. The skewness
($\gamma$) is defined as the ratio of the third central moment to the cube of the standard
deviation.

$\gamma = \left[ \int f(x)\,(x - \mu)^3\,dx \right] / \sigma^3$
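
For a finite sample, the same ratio can be estimated by replacing the integral with an average. The following Python sketch computes this moment-based skewness; the function name and the sample data are hypothetical. Statistical packages often apply a small-sample bias correction, so their values may differ slightly from this plain estimate.

```python
def skewness(values):
    """Third central moment divided by the cube of the standard deviation.

    Uses population (divide by n) moments, with no bias correction.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0.0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


if __name__ == "__main__":
    # Hypothetical durations: a long right tail gives positive skewness.
    print(skewness([1.0, 1.0, 1.0, 10.0]))
    print(skewness([1.0, 2.0, 3.0]))
```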

Definition  of  the  sufficient  conditions  for  TSD: 

(1)  FSD  is  sufficient  for  TSD. 

(2)  SSD  is  sufficient  for  TSD. 